Abstract: When two or more users in a wireless network 
transmit simultaneously, their electromagnetic signals are linearly 
superimposed on the channel. As a result, a receiver that is 
interested in one of these signals sees the others as unwanted 
interference. This property of the wireless medium is typically 
viewed as a hindrance to reliable communication over a network. 
However, using a recently developed coding strategy, interference 
can in fact be harnessed for network coding. In a wired network, 
(linear) network coding refers to each intermediate node taking 
its received packets, computing a linear combination over a finite 
field, and forwarding the outcome towards the destinations. Then, 
given an appropriate set of linear combinations, a destination can 
solve for its desired packets. For certain topologies, this strategy 
can attain significantly higher throughputs over routing-based 
strategies. Reliable physical layer network coding takes this idea 
one step further: using judiciously chosen linear error-correcting 
codes, intermediate nodes in a wireless network can directly 
recover linear combinations of the packets from the observed 
noisy superpositions of transmitted signals. Starting with some 
simple examples, this survey explores the core ideas behind this 
new technique and the possibilities it offers for communication 
over interference-limited wireless networks. 

Index Terms: Digital communication, wireless networks, interference, 
network coding, channel coding, linear code, modulation, 
physical layer, fading, multiuser channels, multiple access, 
broadcast. 



I. Introduction 

In recent years, the number of wireless devices has skyrock- 
eted and, to handle the demands of ever richer multimedia 
applications, these devices have required higher and higher 
data rates. These trends, coupled with the scarcity of spectrum, 
imply that interference between devices will be one of the 
dominant bottlenecks in wireless networking for many years 
to come. In some cases, this interference is purely an obstacle 
to reliable communication. However, in many scenarios, it 
is actually possible to harness interference to enable more 
efficient communication over a network. In this survey, we 
examine a set of novel strategies geared at exploiting wireless 
interference for reliable network coding. 

Nodes in a network occupy one or more of the following 
roles: sources transmit information packets into the network, 
destinations are interested in recovering a set of information 
packets, and relays help move information between sources 
and destinations. In a wired network, the classical approach 
is to have relays forward a subset of their observed packets 
towards the intended destinations. For a wired network with 
a single source, multiple relays, and a single destination, this 
routing strategy is optimal [1], [2]. More generally, routing 
cannot attain the maximum throughput and relays may need to 
send out functions of the packets they observe, rather than just 
repeating them. This network coding strategy was originally 
developed by Ahlswede et al. for optimal multicasting over 
wired networks [3] and has turned out to be quite useful for 
a wide array of networking scenarios. Much of this research 
has focused on linear network coding strategies, where the 
functions are assumed to be linear combinations of the packets, 
taken over an appropriate finite field [4], [5]. 

In a wireless setting, transmitting a packet from one node 
to another naturally causes interference to all nearby nodes. 
If multiple nodes transmit concurrently, their waveforms are 
linearly superimposed, which makes it harder for a receiver to 
recover its desired packets. Yet, for network coding, relays 
do not need to recover the contents of individual packets, 
only an appropriate function thereof. In this overview, we 
will look at physical layer modulation and coding techniques 
that can harness the linear nature of wireless interference for 
linear network coding. If the relays can transmit in a fully 
analog fashion, one possible approach is to have them repeat 
their observed noisy linear combinations directly. While this 
has favorable properties when the signal-to-noise ratio is high 
enough, it is clear that the ensuing end-to-end noise accumula- 
tion is highly undesirable. A more interesting question is thus 
whether the noise can be removed at each stage by appropriate 
error-correcting codes. This is what we will refer to as reliable 
physical layer network coding in this paper. 

Interestingly, an initial information-theoretic analysis might 
suggest that such coding is not feasible and instead, relays 
must first decode the individual packets. In other words, 
attempting to directly decode only a linear combination of the 
messages will implicitly also reveal the individual messages, 
and thus, not be any more efficient than the standard approach. 
Fortunately, this initial attempt is too pessimistic. The key 
insight is that the modulation and coding strategies should 
share a common algebraic structure across transmitters. More 
precisely, if the transmitted waveforms are points of a lattice, 
then every integer combination of these waveforms is itself a 
point of the same lattice. Therefore, receivers can efficiently 
decode these linear combinations with the same framework 
used to decode individual packets. How efficiently depends on 
how closely the coefficients of the desired linear combination 
match the observed channel strengths and phases. 

Reliable physical layer network coding thus involves two 
complementary questions: (i) how to enable encoders and 
decoders to exploit interfering signals for efficient function 
computation, and (ii) at the network level, which functions to 
select in order to enable efficient overall information transfer. 
As we will demonstrate, existing modulation techniques and 
linear error-correcting codes can serve as building blocks 
for these new encoders and decoders. Information can be 
encoded digitally into packets at the transmitter side and 
decoded directly into a linear combination of packets at the 
receiver side. We will illustrate the basic ideas behind this new 
approach starting with very simple examples and gradually 
incorporating many of the aspects of wireless channels. In 
keeping with the survey nature of this paper, we will point to 
relevant papers in the literature along the way. 

II. Network Coding Preliminaries 

Consider a network of several nodes, some of which are 
linked together by wired connections. If one node wishes to 
send a message to another node in the network, then it is 
optimal to simply route the information towards its destination: 
intermediate nodes should simply retransmit their received 
packets. This strategy can achieve the unicast capacity of a 
network which is given by the max-flow min-cut theorem, as 
shown independently by Ford and Fulkerson [1] and Elias, 
Feinstein, and Shannon [2]. Now, suppose that more than one 
destination wants the transmitted message. The seminal paper 
of Ahlswede et al. demonstrated that routing is insufficient 
for this problem and network coding is, in general, required 
to achieve the multicast capacity [3]. The key principle underlying 
network coding is that intermediate nodes should send 
out functions of their received packets, instead of the packets 
themselves. Subsequent work by Li, Cai, and Yeung [4] and 
Kötter and Médard [5] made the important observation that, 
for multicasting, intermediate nodes can simply send out a 
linear combination of their received packets. There is now 
a wealth of literature on the myriad applications of network 
coding to sending information over networks and beyond. A 
comprehensive literature survey is beyond the scope of this 
paper and we refer the interested reader to the other papers in 
this issue as well as to several books on the subject [6]-[10]. 

For the purpose of our exposition and to make the ideas 
behind physical layer network coding apparent, we need 
to develop network coding slightly more formally here. In 
particular, we will consider operations on a finite field, i.e., a 
set of q elements that we will denote without loss of generality 
by $\{0, 1, 2, \ldots, q-1\}$. For ease of exposition, we will assume 
that q is a prime number so that addition and multiplication 
over the finite field can be written as modulo addition and 
multiplication over the reals. For any two integers a and b in 
this set, we will denote addition and multiplication modulo-q 
as 

$$a \oplus b = [a + b] \bmod q \tag{1}$$
$$a \otimes b = [ab] \bmod q . \tag{2}$$


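Suppose an intermediate node (or relay) in the network has received 
packets $w_1, w_2, \ldots, w_L$, each a vector of symbols from the 
finite field. The node's role in a network coding solution 
is to send a linear combination u of these packets towards the 
destination: 

$$u = a_1 w_1 \oplus a_2 w_2 \oplus \cdots \oplus a_L w_L \tag{3}$$
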



where $a_1, a_2, \ldots, a_L$ are coefficients over the finite field. The 
goal is for each destination to collect enough linear combinations 
to infer the original packets. Assume that a destination 
has successfully received linear combinations $u_1, u_2, \ldots, u_M$ 
where 

$$u_m = a_{m1} w_1 \oplus a_{m2} w_2 \oplus \cdots \oplus a_{mL} w_L . \tag{4}$$

Then, it can solve for the original packets if the matrix of 
coefficients 

$$A = \begin{bmatrix} a_{11} & \cdots & a_{1L} \\ \vdots & \ddots & \vdots \\ a_{M1} & \cdots & a_{ML} \end{bmatrix} \tag{5}$$

has rank L. Since we would like all destinations to recover the 
message, we must choose the network coding coefficients at 
each relay so that the matrix of coefficients at each receiver is 
full rank. Jaggi et al. developed an efficient algorithm that can 
find a feasible set of coefficients in polynomial time so long 
as the field size q is larger than the number of receivers. 
Another powerful approach advocated by Kötter and Médard 
as well as Ho et al. is to generate the coefficients randomly at 
each relay [5], [12]. It can be shown that the probability that 
this yields a valid solution increases with the field size. 
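
To make the rank condition concrete, the following minimal Python sketch forms linear combinations over a prime field and recovers the packets by Gauss-Jordan elimination modulo q. The function names (`combine`, `decode_packets`) and the example values are illustrative assumptions; the sketch does not implement the polynomial-time construction of Jaggi et al., it only checks whether a given coefficient matrix is decodable.

```python
def combine(coeffs, packets, q):
    """Linear combination of equal-length packets over F_q, as in (3)."""
    length = len(packets[0])
    return [sum(a * w[i] for a, w in zip(coeffs, packets)) % q
            for i in range(length)]


def decode_packets(A, U, q):
    """Recover the L packets from combinations U with coefficient matrix A.

    A is M x L and U is M x k (row m is the combination u_m), q is prime.
    Returns the list of L packets, or None if A has rank less than L.
    """
    L = len(A[0])
    # Augmented matrix [A | U], reduced modulo q.
    aug = [[a % q for a in ra] + [u % q for u in ru] for ra, ru in zip(A, U)]
    row = 0
    for col in range(L):
        pivot = next((r for r in range(row, len(aug)) if aug[r][col]), None)
        if pivot is None:
            return None  # no pivot in this column: rank < L
        aug[row], aug[pivot] = aug[pivot], aug[row]
        inv = pow(aug[row][col], -1, q)  # modular inverse (Python 3.8+)
        aug[row] = [(x * inv) % q for x in aug[row]]
        for r in range(len(aug)):
            if r != row and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(x - f * y) % q for x, y in zip(aug[r], aug[row])]
        row += 1
    return [aug[i][L:] for i in range(L)]


# Hypothetical example: two packets of three symbols over F_5.
q = 5
packets = [[1, 2, 3], [4, 0, 2]]
A = [[1, 1], [1, 2]]                      # full rank over F_5
U = [combine(row, packets, q) for row in A]
recovered = decode_packets(A, U, q)       # equals packets
```

A rank-deficient choice such as `A = [[1, 2], [2, 4]]` makes `decode_packets` return `None`, which is the situation a careful or random choice of coefficients is meant to avoid.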

It is also instructive to note that within this framework, the 
routing solution corresponds to having the intermediate node 
retransmit one of its received packets, 

$$u = w_\ell \quad \text{for some } \ell \in \{1, 2, \ldots, L\} , \tag{6}$$

during each time slot. 



A. Two-Way Relay Channel 

We now introduce a simple network, the two-way relay 
channel, that will serve as a guiding example and benchmark 
for all of the strategies in the sequel. To the best of our 
knowledge, this example first appeared in a paper by Wu, 
Chou, and Kung in 2004 [13]. As shown in Figure 1, there 
are two users that wish to exchange messages with each other. 
However, in this model, we assume that each user cannot 
hear the other user's transmission. Instead, the users must 
communicate with the help of a relay node "in the middle" 
that can receive from and transmit to both users. 



We will work with the algebraic network coding framework 
introduced in [5]. The transmitting terminal has a message 
which can be represented as a string of bits. This message 
can be broken up into several packets, each of which can be 
written as a length-k vector of elements from the finite field, 
which we will denote by $w \in \mathbb{F}_q^k$. Say an intermediate node 
(or relay) in a network has received some of these packets, 
$w_1, w_2, \ldots, w_L$; its role is to forward the linear combination in (3). 

[Figure: user 1 has $w_1$ and wants $w_2$; user 2 has $w_2$ and wants $w_1$; the relay sits between the two users.] 

Fig. 1. Two-way relay channel. 

The users share the same frequency band so, if both users 
transmit simultaneously, the relay will observe a superposition 
of the two signals corrupted by noise. This interference effect 
can be modeled by a multiple-access channel with inputs $x_1$ 
and $x_2$ and output $y_R$. Each user $m = 1, 2$ generates its 
channel input $x_m$ from its message $w_m$ and the channel output 
$y_R$ is observed by the relay (see Figure 2). Conversely, any 
transmission by the relay will be heard by both users. This 
broadcast effect can be modeled by a broadcast channel with 
input $x_R$ and outputs $y_1$ and $y_2$. 



[Figure: the messages $w_1$ and $w_2$ are sent over a multiple-access channel to the relay, which transmits over a broadcast channel back to the two users.] 

Fig. 2. Two-way relay channel model. 

These two channel models have been studied in depth 
over the past few decades. From an information-theoretic 
perspective, the capacity region for sending messages over 
a multiple-access channel has been completely characterized 
[14], [15]. The capacity region of the broadcast channel is 
also known if the channel is "stochastically degraded" [16]-[18]. 
This condition holds for the wireless channel models 
considered in this paper. We will not delve into the subtleties 
of these capacity results and refer the interested reader to [19]-[21] 
for more details. Roughly speaking, these results tell 
us that for wireless multiple-access and broadcast channels 
at symmetric operating points, each user can attain a rate 
inversely proportional to the number of active users. For the 
two-way relay channel, this means that sending two messages 
to the relay takes approximately twice as much time as sending 
one message. Similarly, sending two different messages from 
the relay to two destinations takes twice as much time as 
sending one message to one destination. 

We make the natural assumption that each terminal must 
operate in half-duplex mode (i.e., it can either send or receive 
during a single time slot but not both). From this, we get that 
combining standard physical layer coding ideas with routing 
allows both users to exchange messages over the two-way 
relay channel in 4 time slots. We illustrate how this can be 
done in Figure 3. Each user takes a time slot to send its 
message to the relay and the relay takes two time slots to 
send these messages to their destinations. 

This performance can be significantly improved through the 
use of network coding [13]. Once the relay has collected $w_1$ 
and $w_2$ (after two time slots) it can easily compute the modulo-2 
sum of the messages $w_1 \oplus w_2$. In the next time slot, it can 
broadcast the sum to both users. Each user can then infer its 
desired message from the sum and its original message. This 
network coding strategy thus allows the users to exchange 



Fig. 3. A routing strategy for the two-way relay channel that requires 4 time 
slots. (a) During the first time slot, user 1 sends its message $w_1$ to the relay. 
(b) During the second time slot, user 2 sends its message $w_2$ to the relay. 
(c) During the third time slot, the relay sends the message $w_1$ to user 2. (d) 
During the fourth time slot, the relay sends the message $w_2$ to user 1. 



messages in only 3 time slots as shown in Figure 4. This gain 
is not in conflict with the capacity region of the broadcast 
channel as the relay only needs to send out one common 
message rather than two distinct messages. 
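
The broadcast step of this three-slot strategy can be sketched in a few lines of Python. This is a minimal illustration under the assumption that both messages are byte strings of equal length; the helper name `xor_bytes` and the example messages are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise modulo-2 sum of two equal-length packets."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


w1 = b"hello"  # message held by user 1
w2 = b"world"  # message held by user 2

# Slots 1 and 2: the relay collects w1 and w2.
# Slot 3: the relay broadcasts a single packet, the modulo-2 sum.
broadcast = xor_bytes(w1, w2)

# Each user removes its own message from the sum to obtain the other one.
user1_decodes = xor_bytes(broadcast, w1)  # equals w2
user2_decodes = xor_bytes(broadcast, w2)  # equals w1
```

The broadcast packet has the same length as a single message, which is why the relay needs only one time slot for the downlink instead of two.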

Remark 1: Broadcasting a common message is limited by 
the weakest channel from the relay to a single user. We also 
note that in general, it is not optimal to send a common 
message comprised of the sum of the bits [22]-[25], but, for 
the channel models we will consider, it will suffice. 



Fig. 4. A network coding strategy for the two-way relay channel that requires 
3 time slots. (a) During the first time slot, user 1 sends its message $w_1$ to 
the relay. (b) During the second time slot, user 2 sends its message $w_2$ to the 
relay. (c) During the third time slot, the relay sends the sum of the messages 
$w_1 \oplus w_2$ to both users. 



The two-way relay channel is just one of the many scenarios 
where the broadcast property of the wireless medium 
can be exploited via network coding. This behavior has 
been thoroughly investigated theoretically [13], [25]-[35] and 
demonstrated in practice [36]. 

Now that we know it is more efficient for the relay to send 
the sum of the messages during the broadcast phase, it is 
natural to ask whether savings are also possible during the 
multiple-access phase. Since the relay only needs the sum, 
we could do even better by conveying the sum to the relay 
without identifying the individual messages. Simultaneously 
transmitted signals are added up on the wireless channel and, 
as we will show, this property can be exploited to send the sum 
(or another linear function) to the relay in a single time slot. 
Figure 5 illustrates how this scheme allows users to exchange 
messages in just 2 time slots. The remainder of this paper 
is devoted to showing how this is possible and characterizing 
the exact gains. We will start with some rudimentary examples 
and gradually build up a toolkit for general networks. 



Fig. 5. A physical layer network coding strategy for the two-way relay 
channel that requires 2 time slots. (a) During the first time slot, the users 
send the sum of their messages $w_1 \oplus w_2$ to the relay using a nested lattice 
scheme. (b) During the second time slot, the relay sends the sum of the 
messages back to both users. 

Remark 2: Note we did not take into consideration the dis- 
tributed scheduling problems that arise in wireless networks. 
For instance, in the backoff protocol of the IEEE 802.11 
standard, users listen to see if the channel is free before 
they transmit. If the channel is in use, they remain silent for 
a random interval before listening again. We are primarily 
interested in the gains due to novel signaling schemes and 
will assume ideal scheduling. The interaction between these 
new schemes and practical scheduling algorithms is beyond 
the scope of this paper. 

III. A Finite Field Physical Layer 

We start our discussion by considering a hypothetical phys- 
ical layer that is particularly well-suited to the standard linear 
network coding, yet serves to illustrate some of the key 
properties and effects that can be exploited over more realistic 
physical layers, such as the wireless case discussed in the 
second part of this paper. This finite field model will help 
build intuition for the more intricate strategies used later on. 

A. Noiseless Interference 

Specifically, let us first consider a channel model that 
is completely noise-free, and where the transmitted signals 
interfere in a modulo-additive way. That is, each transmitted 
symbol takes values in $\{0, 1, 2, \ldots, q-1\}$, where we assume 
that q is a prime number. Let $x_\ell[t]$ denote the symbol transmitted 
by the $\ell$th user in time slot t. Separately for each time slot, 
the physical layer provides, as its channel output, the modulo-q 
sum of all the input signals from all L transmitters during 
that time slot: 

$$y[t] = x_1[t] \oplus x_2[t] \oplus \cdots \oplus x_L[t] . \tag{7}$$


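We find it convenient to consider blocks of n time slots jointly, 
which we represent in vector notation: 

$$x_\ell = \big[\, x_\ell[1] \;\; x_\ell[2] \;\; \cdots \;\; x_\ell[n] \,\big]^T \tag{8}$$
$$y = \big[\, y[1] \;\; y[2] \;\; \cdots \;\; y[n] \,\big]^T \tag{9}$$

where $^T$ is the transpose operator. Thus, the channel output 
can be expressed as 

$$y = x_1 \oplus x_2 \oplus \cdots \oplus x_L . \tag{10}$$
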



It is quite simple to exploit this channel for linear network 
coding. Each transmitter should just pre-multiply its packet 
by an appropriately chosen coefficient $a_\ell$ out of the set 
$\{0, 1, 2, \ldots, q-1\}$, where multiplication is again modulo-q, 
and transmit the resulting signal on the physical layer: 

$$x_\ell = a_\ell w_\ell . \tag{11}$$

The channel output is then exactly equal to our desired linear 
combination 

$$y = a_1 w_1 \oplus a_2 w_2 \oplus \cdots \oplus a_L w_L = u . \tag{12}$$




In one shot, the receiver learned exactly what it needed to 
make the network code work and not one bit more. This 
should be contrasted to the standard approach in which the 
transmitters would take turns, each sending its entire packet 
to the receiver, who would then compute the desired linear 
combination (and forward it). Clearly, the latter would take L 
times longer to complete. Therefore, for this very particular 
"physical" layer, the described (simple and obvious) scheme 
attains a speedup of a factor of L, which can also be shown to 
be the optimal attainable performance (see Section III-C2). It 
will be convenient to capture the performance of the computation 
scheme by what we will refer to as the computation rate, 
namely, the number of bits of the linear function successfully 
recovered per channel use. For the present example, this 
evaluates to 

$$R_{\mathrm{COMP}} = \log_2 q \tag{13}$$

whereas sending the data separately can only attain 

$$R_{\mathrm{COMP}} = \frac{1}{L} \log_2 q . \tag{14}$$

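
The noiseless modulo-adder channel is easy to simulate. The following Python sketch, with hypothetical helper names and example values, has each transmitter pre-multiply its packet by its coefficient and lets the channel form the symbol-wise modulo-q sum, so that the receiver observes the desired linear combination after n channel uses rather than L times n.

```python
import math


def premultiply(coefficient, packet, q):
    """Scale every symbol of a packet by a coefficient, modulo q (as in (11))."""
    return [(coefficient * symbol) % q for symbol in packet]


def modulo_adder_channel(inputs, q):
    """Noiseless channel: symbol-wise modulo-q sum of all inputs (as in (10))."""
    return [sum(symbols) % q for symbols in zip(*inputs)]


def computation_rate(q, num_users, simultaneous=True):
    """Bits of the linear function recovered per channel use, (13) and (14)."""
    rate = math.log2(q)
    return rate if simultaneous else rate / num_users


# Hypothetical example: L = 3 users, packets of n = 4 symbols over F_7.
q = 7
packets = [[1, 2, 3, 4], [6, 5, 4, 3], [0, 1, 0, 1]]
coefficients = [2, 3, 1]

channel_inputs = [premultiply(a, w, q) for a, w in zip(coefficients, packets)]
y = modulo_adder_channel(channel_inputs, q)  # the combination u, in 4 channel uses
```

With these values, `computation_rate(7, 3)` is three times `computation_rate(7, 3, simultaneous=False)`, matching the factor-of-L speedup.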
Fig. 6. A noisy modulo-adder channel. 

B. Interference with Erasures 

Let us now bring this model one step closer to physical 
reality. In particular, we will add noise into the picture and 
show that in some interesting cases, exactly the same speedup 
is possible. We begin with an erasure channel. Say that out 
of every block of three symbols put into the channel, one is 
chosen at random and erased. If we just transmit uncoded, 
some of the symbols will be lost and we will fail to achieve 
our goal of error-free network coding. We can overcome this 
problem with an error-correcting code that adds a parity-check 
to every two symbols. Let $b_1$ and $b_2$ be symbols taking values 
in $\{0, 1, 2, \ldots, q-1\}$. The code maps these two information 
symbols into three symbols 

$$[\, b_1 \;\; b_2 \;\; b_1 \oplus b_2 \,] \tag{15}$$




which can then be transmitted over the channel. If one of these 
three symbols is missing, it can be recovered from the other 
two so we can reliably transmit symbols over the channel. 
Now, consider a two-user channel like that shown in Figure 6. 
The output is the mod-q sum of the two users' transmissions, 
except that one out of three symbols is randomly erased. In 
Figure 7, we illustrate a standard approach to error control over 
a noisy multiple-access channel: each user encodes its own 
symbols and is allocated its own time slots for transmission. 
Our ultimate goal is to reconstruct the linear combinations 
$b_1 \oplus c_1$ and $b_2 \oplus c_2$ at the decoder, where $c_1$ and $c_2$ 
denote the second user's information symbols. Transmitting all the symbols 
reliably to the receiver requires a total of 6 channel uses as we 
need one parity symbol per user to recover from the erasure. 
However, as shown in Figure 8, if both users transmit their 
coded symbols simultaneously, this can be accomplished in 3 
channel uses. The parity checks $b_1 \oplus b_2$ and $c_1 \oplus c_2$ will be 
added up by the channel. Therefore, the receiver will observe 
$b_1 \oplus c_1 \oplus b_2 \oplus c_2$, which serves as a parity check on the desired 
linear combinations. In other words, the channel combines 
the original parity checks in exactly the right way. In fact, 
if all the information was available at a single transmitter, we 
would pre-compute the linear combinations and use the same 
parity check to protect them. In summary, the key observation 
is a speedup (or, equivalently, capacity gain) proportional to 
the number of transmitting terminals (two, in the example 
just discussed). More explicitly, the computation rate for this 
scheme is 

$$R_{\mathrm{COMP}} = \frac{2}{3} \log_2 q \tag{16}$$

and the computation rate resulting from sending all of the data 
is 

$$R_{\mathrm{COMP}} = \frac{1}{3} \log_2 q . \tag{17}$$

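
The erasure example can be checked with a short Python sketch. It assumes, as in the text, that exactly one of the three symbols in a block is erased; the names `encode_parity`, `erasure_adder_channel`, and `decode_sums` are illustrative and do not come from the cited papers.

```python
ERASED = None  # marker for an erased channel output


def encode_parity(s1, s2, q):
    """Map two information symbols to [s1, s2, s1 + s2 mod q], as in (15)."""
    return [s1 % q, s2 % q, (s1 + s2) % q]


def erasure_adder_channel(x1, x2, erased_index, q):
    """Modulo-q sum of two codewords with one output symbol erased."""
    y = [(a + b) % q for a, b in zip(x1, x2)]
    y[erased_index] = ERASED
    return y


def decode_sums(y, q):
    """Recover (b1 + c1, b2 + c2) mod q from a block with at most one erasure."""
    s1, s2, parity = y
    if s1 is ERASED:
        s1 = (parity - s2) % q
    elif s2 is ERASED:
        s2 = (parity - s1) % q
    return s1, s2


# Hypothetical example over F_5: user 1 holds (b1, b2), user 2 holds (c1, c2).
q = 5
b1, b2 = 3, 4
c1, c2 = 2, 4
x1 = encode_parity(b1, b2, q)
x2 = encode_parity(c1, c2, q)

# Whichever symbol is erased, three channel uses deliver both desired sums.
results = [decode_sums(erasure_adder_channel(x1, x2, i, q), q) for i in range(3)]
```

Because the channel adds the two parity symbols, the third output is a parity check on the two desired sums, so the receiver never needs the individual symbols.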



Fig. 7. Reliable computation over a modulo-adder with erasures. Users take 
turns sending their data. Afterwards, the receiver computes the desired sum. 

Fig. 8. Reliable computation over a modulo-adder with erasures. Users send 
their data at the same time and the receiver directly infers the desired sum. 



C. Interference with Modulo-Additive Noise 

The key idea in the example above is that not only does 
the channel naturally compute the linear combinations of the 
information symbols, it also computes the parity checks for 
them. As it turns out, this idea can take us quite far for noisy 
modulo-adder channels. 

1) Algebraic Coding Perspective: We find it instructive to 
start the discussion by considering classical algebraic error-correction 
codes. To this end, let G be an $n \times k$ generator 
matrix for a linear code that can correct up to d errors. Both 
users in Figure 6 encode their respective messages using this 
generator matrix, meaning that they will choose channel inputs 
$x_1 = G w_1$ and $x_2 = G w_2$, respectively. Now, the channel 
output can be expressed as 

$$y = x_1 \oplus x_2 \oplus z \tag{18}$$
$$= G w_1 \oplus G w_2 \oplus z \tag{19}$$
$$= G (w_1 \oplus w_2) \oplus z \tag{20}$$




where the last step follows from the distributive property 
of matrix multiplication. The key observation is that the 
generator matrix G directly protects the modulo-sum of the 
two messages. Therefore, as long as there are no more than d 
errors, i.e., as long as the Hamming weight$^1$ of the error vector 
z is no more than d, the modulo-sum of the messages can 
be perfectly recovered. It is important to note that this again 
represents a speedup of a factor proportional to the number 
of users (two, in this example) over the standard approach of 
sending the full messages to the decoder. 
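
As one concrete instance of this argument, the sketch below uses the binary (7,4) Hamming code, which corrects d = 1 error. The choice of code, the parity matrix `P`, and the function names are assumptions made for illustration; any linear code shared by both users would serve. Both users encode with the same generator matrix, the channel adds the codewords modulo 2 together with a weight-one error vector z, and syndrome decoding returns the modulo-2 sum of the messages.

```python
# Parity part of a systematic (7,4) Hamming code: G = [I_4; P], H = [P | I_3].
P = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]

# Columns of the parity-check matrix H, used to locate a single error.
H_COLUMNS = ([[P[i][j] for i in range(3)] for j in range(4)]
             + [[1, 0, 0], [0, 1, 0], [0, 0, 1]])


def encode(w):
    """Codeword G w over F_2: the 4 message bits followed by 3 parity bits."""
    parity = [sum(p * s for p, s in zip(row, w)) % 2 for row in P]
    return list(w) + parity


def syndrome(y):
    """Syndrome H y over F_2."""
    return [(sum(p * s for p, s in zip(row, y[:4])) + y[4 + i]) % 2
            for i, row in enumerate(P)]


def decode(y):
    """Correct up to one bit error and return the 4 message bits."""
    y = list(y)
    s = syndrome(y)
    if any(s):
        y[H_COLUMNS.index(s)] ^= 1  # flip the bit whose H column matches
    return y[:4]


w1 = [1, 0, 1, 1]
w2 = [0, 1, 1, 0]
z = [0, 0, 1, 0, 0, 0, 0]  # error vector of Hamming weight 1

# Channel output y = G w1 + G w2 + z (mod 2), as in (18).
y = [(a + b + e) % 2 for a, b, e in zip(encode(w1), encode(w2), z)]
decoded_sum = decode(y)  # equals w1 + w2 (mod 2)
```

The decoder never sees `w1` or `w2` individually; it treats the received word as a noisy codeword for the sum, exactly as (20) suggests.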

2) Information-Theoretic Perspective: Next, we discuss 
how this idea naturally extends beyond the fixed error model 
of classical error-correction codes to models with random 
errors. To begin, it is insightful to consider the standard con- 
nection between such codes and classical information theory. 
The standard information-theoretic approach for showing the 
achievability of a certain rate of transmission is via the so-called 
random coding argument, a version of the probabilistic 
method. First, codewords are drawn element-by-element independently 
from a fixed probability distribution. Then, it is 
shown that the decoding error probability at the receiver is very 
small as long as the number of codewords is small enough. 
This involves a union bound over all codewords. Finally, it 
can be argued that since the probability of error is small on 
average over the random codebook, there must be at least 
one good fixed codebook (see [19, Theorem 7.7.1] for more 
details). Clearly, the resulting randomly chosen codebook used 
in this argument has no algebraic structure whatsoever (with 
probability one). 

$^1$ The Hamming weight of a vector is the number of symbols that are not 
equal to zero. 

However, there is an alternative argument that also 
establishes achievable rates of communication and that does 
involve algebraic structure. Here, one starts by randomly 
drawing each element of a generator matrix G independently 
from the uniform distribution over $\{0, 1, 2, \ldots, q-1\}$. We 
note that by this construction, the elements of all codewords 
will be uniformly distributed over the entire q-ary alphabet. It 
can be shown with a little more work that the resulting code 
has pairwise independent codewords (see [37, Section 6.2]) so 
one can still take a union bound as above. The main trick is 
to let each transmitter employ this same code: 

$$x_\ell = a_\ell G w_\ell . \tag{21}$$

The channel output then looks as if the desired function u was 
directly encoded with G: 

$$y = G (a_1 w_1 \oplus \cdots \oplus a_L w_L) \oplus z \tag{22}$$
$$= G u \oplus z . \tag{23}$$


Since the codewords are pairwise independent, standard 
information-theoretic arguments can be applied. It can be 
shown that the probability of decoding error can be made 
arbitrarily small (by increasing the coding blocklength) so long 
as the rate is smaller than the mutual information between the 
sum$^2$ of the inputs $X_1 \oplus \cdots \oplus X_L$ and the output $Y$, which 
can be calculated as follows: 

$$I(X_1 \oplus \cdots \oplus X_L ; Y) \tag{24}$$
$$= H(Y) - H(Y \mid X_1 \oplus \cdots \oplus X_L) \tag{25}$$
$$= H(Y) - H(X_1 \oplus \cdots \oplus X_L \oplus Z \mid X_1 \oplus \cdots \oplus X_L)$$
$$= H(Y) - H(Z) . \tag{26}$$

For this random linear code construction, the channel inputs 
are uniformly distributed over all q letters, and thus, Y is also 
uniformly distributed over all q letters, meaning that $H(Y) = \log_2 q$. 
Thus, any rate up to 

$$R_{\mathrm{COMP}} = \log_2 q - H(Z) \tag{27}$$


bits per channel use is achievable. For comparison, sending all 
of the data separately and then evaluating the function requires 
L times more channel uses, resulting in a rate 

$$R_{\mathrm{COMP}} = \frac{1}{L} \left( \log_2 q - H(Z) \right) \tag{28}$$

bits per channel use. 

$^2$ One intriguing mathematical issue that arises here is that the usual proof 
techniques, such as random coding, are not able to take advantage of the 
channel between the sum of the inputs and the output: algebraically structured 
codes seem to be necessary [38]. 
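
The two rate expressions are simple to evaluate numerically. The Python sketch below computes them for a given noise distribution over the q-ary alphabet; the function names and the example noise probabilities are assumptions chosen for illustration.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy in bits of a probability mass function."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def computation_rate(noise_pmf):
    """Rate log2(q) - H(Z) of (27); q is the length of the noise pmf."""
    q = len(noise_pmf)
    return math.log2(q) - entropy_bits(noise_pmf)


def separate_decoding_rate(noise_pmf, num_users):
    """Rate (1/L)(log2(q) - H(Z)) of (28) when users take turns."""
    return computation_rate(noise_pmf) / num_users


# Hypothetical binary example: Z = 1 with probability 0.11, L = 2 users.
noise = [0.89, 0.11]
r_comp = computation_rate(noise)                 # roughly 0.5 bits per channel use
r_separate = separate_decoding_rate(noise, 2)    # half of r_comp
```

For a noiseless channel, `computation_rate([1.0, 0.0])` reduces to the value in (13), and for uniformly distributed noise the rate drops to zero.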

The idea of using the same linear code at each encoder 
for this setting was introduced by us in a 2005 paper [39] 
(see [40] for the journal version). Our inspiration came from a 
paper by Körner and Marton which uses this technique for the 
distributed compression of the parity of two dependent binary 
sources [41]. The gains in their setting come from the degree 
of dependence between the two binary random variables. In 
our case, the gains come from eliminating the need to send 
all the data to the receiver.$^3$ 

We now briefly discuss the information-theoretic optimality 
of the proposed coding technique for the special case of linear 
modulo-additive channel models. For such channels, the rate 
cannot exceed 

$$R \le \max_{p(x_1, \ldots, x_L)} I(X_1, \ldots, X_L ; Y) \tag{29}$$

as this is the best performance attainable if all transmitters 
could fully cooperate. Note that the maximization 
is over the probability distribution of the channel inputs 
$(X_1, X_2, \ldots, X_L)$. Rewriting this mutual information expression 
in terms of entropies yields 

$$I(X_1, \ldots, X_L ; Y) \tag{30}$$
$$= H(Y) - H(Y \mid X_1, \ldots, X_L) \tag{31}$$
$$= H(Y) - H(X_1 \oplus \cdots \oplus X_L \oplus Z \mid X_1, \ldots, X_L) \tag{32}$$
$$= H(Y) - H(Z) \tag{33}$$
$$\le \log_2 q - H(Z) \tag{34}$$

where the inequality is due to the fact that the entropy cannot 
be larger than the logarithm of the alphabet size. This means 
that in the special case of the modulo-additive channel, the 
proposed code attains the best possible computation rate, i.e., 
the computation capacity. 

D. Beyond Finite Field Models 

This coding strategy can certainly be employed in channels 
that cannot be represented as a noisy finite field sum of their 
inputs. Although it may not always attain the capacity, it 
often provides better performance than sending all of 
the data to the receiver. Assume each channel input takes 
values on $\{0, 1, 2, \ldots, q-1\}$ (otherwise relabel the symbols 
appropriately). We use the same encoding procedure as before 
by having each transmitter send $x_\ell = a_\ell G w_\ell$. It can be 
shown that the receiver can reliably decode the function 
$a_1 w_1 \oplus \cdots \oplus a_L w_L$ so long as the rate is less than the mutual 
information between this desired function and the channel 
output: 

$$R_{\mathrm{COMP}} < I(a_1 X_1 \oplus \cdots \oplus a_L X_L ; Y) \tag{35}$$

where $X_1, \ldots, X_L$ are uniformly distributed. See [42, Theorem 7] 
for more details. Another possible strategy is to have 
each transmitter send its data uncoded simultaneously and then 
take turns sending parity checks while avoiding collisions. In 
some cases, this does better than using the same linear code 
at each transmitter. We studied this strategy in more depth 
in [40], [42] under the moniker of "systematic computation 
coding." 

$^3$ In fact, we can also take advantage of the dependencies between messages 
while exploiting the channel's natural computation. To simplify the presentation, 
we have assumed throughout that users' messages are independent. See 
[40] for more details. 

While channels that operate over the finite field may seem 
a bit contrived, they can be quite useful to model certain 
wireless scenarios. For instance, in Section V-A, we will 
explore a scheme in which the receiver makes a hard decision 
on the modulo-2 sum of the transmitted bits. There is some 
probability, depending on the signal-to-noise ratio, that this 
estimate of the sum is in error. Thus, after the hard decision, 
the channel is precisely a noisy modulo-2 adder channel and 
the coding techniques developed above can be used to denoise 
the modulo-2 sum of the bits. 

Finite field models can also be used to create accurate 
approximations of certain classes of wireless networks as 
shown by Avestimehr, Diggavi, and Tse [43]. Several groups
have studied network coding strategies in the context of finite
field physical layer models, including [44]-[48].



IV. The Wireless Medium 

The structure and interpretation of the physical layer is
of key importance for the ideas presented in this paper. For
example, in Section III we considered a finite field "physical"
layer that could be easily incorporated into the network code
construction, leading to significant speedup (or, equivalently,
capacity gains). Much of the remainder of this paper is devoted
to a physical layer that is of particular current interest, namely,
the wireless medium. Therefore, we first briefly revisit its
standard models and properties. We will refrain from a deep
discussion of this, referring the reader instead to a host of
textbooks and monographs on the topic, including [49], [50].
However, there are three key observations that are important
for the techniques discussed in this paper, and we review
them here in turn.

1) Signal fading is linear: Between the transmitter and a
receiver, an electromagnetic signal undergoes a (potentially 
time-varying) linear transformation. This transformation is 
primarily induced by reflections and multi-path propagation. 
Assuming band-limited communication, the respective signals 
can be represented uniquely by complex-valued discrete-time 
samples. Then, the received signal at any point in space can 
be expressed as the convolution of the transmitted signal 
with an impulse-response function that characterizes the signal 
propagation. Short of modeling exactly the physical surround- 
ings, a popular approach is to model this impulse-response 
function in a statistical fashion. Assuming flat (meaning fre- 
quency non-selective) fading, as would be appropriate for 
narrow-band communication, this impulse-response function 
reduces to a delta function whose height characterizes the 
signal propagation. In this particularly simple and commonly 
studied model, the induced signal at any point in space can
be expressed as $hX[t]$, where $X[t]$ is the signal transmitted
in time slot $t$. The random fading $h$ is often modeled as
a (circularly symmetric complex) Gaussian random variable,
though this is not fundamental for the exposition here. More
importantly, however, it is usually assumed that the fading $h$
is known exactly to the receiver, which is motivated by signal
measurements that can be acquired at the receiver. We will
also make this assumption throughout this paper. A natural
follow-up question concerns whether $h$ is also known to the
transmitter. In this paper, we will generally assume that the
transmitter is ignorant of $h$.

2) Multiple signals interfere in a linear additive way: This 
second point is a direct extension of the first one. Namely, 
consider now that L transmitters are active simultaneously. 
Then, along the same lines described above, the induced signal
at any point in space can be expressed as $\sum_{\ell=1}^{L} h_\ell X_\ell[t]$. It is
commonly assumed that the respective fading coefficients $h_\ell$
are independent of each other, each following a Gaussian law,
and we will follow this assumption throughout, although it is 
not fundamental for our main arguments. 

3) Noise is independent of the signal and added at the 
receiver: More particularly, in line with the standard models, 
we will assume that the noise distribution is described by a 
(circularly symmetric complex) Gaussian law, although again 
this is not fundamental for the ideas laid out here. 

These three key observations, together with several more 
detailed considerations, lead to the following commonly used 
model for the signal at a particular receiver when L transmit- 
ters are simultaneously active: 



$$Y[t] = \sum_{\ell=1}^{L} h_\ell X_\ell[t] + Z[t] \tag{36}$$



As this model shows, the undesirable element for linear 
network coding on the physical layer is the noise: Generally, 
it will add up over the various stages of the network, and thus, 
suitable reliable coding is necessary. 
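The channel model (36) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not code from the paper: the helper names `complex_gaussian` and `received_signal` are hypothetical, and the noise is drawn as circularly symmetric complex Gaussian with half of the variance in each of the real and imaginary parts.

```python
import random

def complex_gaussian(variance, rng):
    """One circularly symmetric complex Gaussian sample with the given total variance."""
    s = (variance / 2) ** 0.5  # real and imaginary parts each carry half the variance
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def received_signal(inputs, gains, noise_var, rng):
    """Return Y[t] = sum_l h_l * X_l[t] + Z[t] for every time slot t.

    inputs: list of L equal-length symbol sequences (one per transmitter)
    gains:  list of L fading coefficients h_l, assumed known to the receiver
    """
    n = len(inputs[0])
    return [
        sum(h * x[t] for h, x in zip(gains, inputs)) + complex_gaussian(noise_var, rng)
        for t in range(n)
    ]

# Two transmitters, two time slots, noiseless case for readability.
y = received_signal([[1 + 0j, -1 + 0j], [1 + 0j, 1 + 0j]], [2, 3], 0.0, random.Random(0))
```

With the noise variance set to zero the output is exactly the linear superposition of the inputs, which is the property the rest of the paper exploits.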

A. Gaussian Channel Capacity 

Consider the special case of the model above where there
is only one active transmitter. This is the classical Gaussian
channel:

$$Y[t] = hX[t] + Z[t] \tag{37}$$

if $Z[t]$ is taken to be independent and identically distributed
(i.i.d.) circularly symmetric complex Gaussian noise with
variance $\sigma^2$. To model the fact that the transmitter has a limited
power budget, it is usually required that the transmitted signal
satisfies, over a block of $n$ channel uses,

$$\frac{1}{n}\sum_{t=1}^{n} |X[t]|^2 \le P \tag{38}$$

The seminal paper of Shannon showed that the capacity of
this channel is

$$C = \log_2\left(1 + \frac{|h|^2 P}{\sigma^2}\right) \tag{39}$$

bits per channel use [51]. Of course, Shannon only showed
the existence of good block codes, not any explicit constructions.
After more than sixty years of research, codes with
low complexity encoding and decoding algorithms have been
developed that can come quite close to the capacity. Describing
these codes is far beyond the scope of this paper so we point
the interested reader to a survey by Forney and Costello that
appeared in an earlier issue of these Proceedings [52]. For our
purposes, we will assume the existence of good encoders and
decoders that can achieve the channel capacity. One important
aspect of a capacity-achieving code is that the elements of each
codeword look as if they were sampled i.i.d. from a Gaussian
distribution with variance $P$. In Section IX we will describe
some very recent efforts to design practically viable codes for
physical layer network coding.
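As a quick numerical companion to the capacity expression, the sketch below evaluates $\log_2(1 + |h|^2 P / \sigma^2)$, the standard capacity of a complex Gaussian channel whose gain is known at the receiver. The function name and argument names are illustrative choices, not notation from the paper.

```python
import math

def gaussian_capacity(power, noise_var, gain=1.0):
    """Capacity in bits per complex channel use of Y = hX + Z with h known at the receiver."""
    return math.log2(1 + abs(gain) ** 2 * power / noise_var)

# At P / sigma^2 = 1 the capacity is one bit per channel use;
# quadrupling 1 + SNR (SNR = 3 -> 15) adds two more bits.
c_low = gaussian_capacity(1.0, 1.0)
c_high = gaussian_capacity(15.0, 1.0)
```

Only the magnitude of the gain matters here, so a pure phase rotation such as `gain=1j` leaves the value unchanged.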

V. Uncoded Strategies 



A quick comparison of (56) and (36) reveals that our
wireless channel model has an input-output relationship that
is nearly the same as our desired network coding operation. 
Specifically, both the wireless channel and random linear 
network coding output a linear combination of their inputs, 
with the coefficients generated randomly according to some 
distribution. However, the wireless channel exhibits two key 
differences: 

1) It operates over the complex field instead of a finite field. 

2) The receiver only observes a noisy version of the linear 
combination. 

Our goal, for the remainder of this paper, is to show that by 
using appropriate modulation and coding techniques we can 
exploit the wireless medium for reliable network coding over 
a finite field. In this section, we will see how much is already 
possible using uncoded modulation strategies. In other words, 
we will try to map the complex field into the finite field but 
we will ignore the effects of the noise.

A. Finite Constellations 

The most intuitive physical layer network coding strategy 
is to have the users transmit their message bits directly on 
the wireless channel. If the channel gains are equal, then the 
receiver will get the noisy sum of the bits from which it 
can make an estimate of its desired modulo-2 sum. To the 
best of our understanding, this key idea was independently 
and concurrently proposed by three research groups in 2006: 
Zhang, Liew, and Lam [53], Popovski and Yomo [54], and
ourselves [45]. In their paper, Zhang, Liew, and Lam also
coined the term physical layer network coding. Here, we 
examine this strategy in the context of the two-way relay 
channel. 

Consider the two-way relay channel in Figure 2 and assume
that both users transmit simultaneously. Ideally, we would
like the channel to directly compute the mod-2 sum of the
transmitted bits. The relay could then broadcast the sum back
to the users and complete the entire exchange in only two time
slots. Of course, the channel does not output the mod-2 sum
so we will have to be a bit more clever. For now, assume that
each user knows its channel gain to the relay and can invert
it. The relay therefore sees the noisy sum of the transmitted
signals and we can consider the real and imaginary parts of
the channel separately. The real part of the signal observed at
the relay at time $t$ is

$$Y_R[t] = X_1[t] + X_2[t] + Z_R[t] \tag{40}$$

where $X_\ell[t]$ is the real-valued symbol transmitted by user $\ell$ at
time $t$ and $Z_R[t]$ is Gaussian noise with variance $\sigma^2$. We now
examine a simple strategy for transmitting a noisy modulo-2
sum of the bits to the relay over the real part of the channel.
The same scheme can be applied to the imaginary part.

For ease of analysis, assume that the total power per channel
use is $P = 2$, which means that we can allocate one unit of
power to the real part and one unit to the imaginary part. Let
$W_\ell$ denote the bit from user $\ell$. Each user maps its bit to a
channel input symbol using binary phase-shift keying (BPSK)

$$X_\ell = \begin{cases} 1 & W_\ell = 1, \\ -1 & W_\ell = 0. \end{cases} \tag{41}$$

Therefore, if the mod-2 sum of the bits $U = W_1 \oplus W_2$ is 1,
then the sum of the transmitted signals $X_1 + X_2 = 0$. Similarly,
if $U$ is 0, then $X_1 + X_2$ is either 2 or $-2$, depending on the
original bits. We would like to design a decoding rule for the
relay to make an estimate $\hat{U}$ of the mod-2 sum $U$ from its
noisy observed sum $Y_R$. For simplicity, we assume that each
user's bit is generated from a fair coin toss. The maximum a
posteriori (MAP) rule to minimize the probability that $\hat{U}$ is in
error is given by:

$$\hat{U}_{\mathrm{MAP}} = \arg\max_{b \in \{0,1\}} f_{Y_R}(y \mid U = b) \Pr(U = b) \tag{42}$$



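where $f_{Y_R}(y \mid U = b)$ is the conditional probability distribution
of the channel output given the mod-2 sum. Since the noise
is Gaussian, the probability density function is

$$f_{Y_R}(y \mid U = b) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-y^2/2\sigma^2} & b = 1, \\[2mm] \dfrac{1}{2\sqrt{2\pi}\,\sigma}\left(e^{-(y-2)^2/2\sigma^2} + e^{-(y+2)^2/2\sigma^2}\right) & b = 0. \end{cases} \tag{43}$$

Note that for $U = 0$, $Y_R$ follows a Gaussian mixture distribution
since $X_1 + X_2$ can be either 2 or $-2$ with equal
probability. A bit of calculation reveals that the MAP rule is
just a threshold on the magnitude of the received signal

$$\hat{U}_{\mathrm{MAP}} = \begin{cases} 1 & |Y_R| < 1 + (\sigma^2 \ln 2)/2, \\ 0 & \text{otherwise,} \end{cases} \tag{44}$$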



and the error probabilities can be computed using the Q-function.
Applying this strategy to every bit allows the relay
to obtain a corrupted version of the modulo-2 sum of the bit
strings,

$$\hat{\mathbf{u}} = \mathbf{w}_1 \oplus \mathbf{w}_2 \oplus \mathbf{e} \tag{45}$$

where $\mathbf{e}$ is the error vector. In Figure 9 we illustrate how this
can enable more efficient communication over the two-way
relay channel. In one time slot, both transmitters send their
messages concurrently, giving the relay the noisy modulo-2
sum. In the next time slot, the relay broadcasts the modulo-2
sum to the users, which can then solve for a corrupted version
of their desired bits. Coupling this strategy with an end-to-end
error correcting code allows the users to successfully exchange
messages using just 2 time slots. While this coarse analysis
seems to favor uncoded transmission, we must account for the
level of error correction needed to recover from the errors
introduced at the relay. In Section VII we compare the
performance of this strategy to other strategies.

Footnote: Note that we can model any signal-to-noise ratio by changing the noise
variance.
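The uncoded BPSK strategy above can be tried out with a short Monte Carlo sketch. It assumes equal, inverted channel gains and real Gaussian noise of variance `noise_var`, and it uses the threshold rule quoted in the text, $|Y_R| < 1 + (\sigma^2 \ln 2)/2$. The function names and the seed are illustrative assumptions.

```python
import math
import random

def bpsk(bit):
    """Map a bit to a BPSK symbol: 1 -> +1, 0 -> -1."""
    return 1.0 if bit == 1 else -1.0

def decide_mod2_sum(y, noise_var):
    """Threshold rule from the text: decide 1 when |y| < 1 + (noise_var * ln 2) / 2."""
    return 1 if abs(y) < 1 + noise_var * math.log(2) / 2 else 0

def simulate(num_bits, noise_var, seed=0):
    """Empirical error rate of the relay's estimate of W1 xor W2."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(num_bits):
        w1, w2 = rng.randint(0, 1), rng.randint(0, 1)
        y = bpsk(w1) + bpsk(w2) + rng.gauss(0.0, math.sqrt(noise_var))
        if decide_mod2_sum(y, noise_var) != (w1 ^ w2):
            errors += 1
    return errors / num_bits

error_rate = simulate(10000, 0.25)
```

The relay never learns the individual bits in this sketch: a received value near zero only reveals that the two bits differ, which is all that is needed for the mod-2 sum.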




Fig. 9. A physical layer network coding strategy for the two-way relay
channel that requires 2 time slots. (a) During the first time slot, both users
transmit their messages, which gives the relay access to a noisy sum of the
packets, $\mathbf{w}_1 \oplus \mathbf{w}_2 \oplus \mathbf{e}$. (b) During the second time slot, the relay broadcasts
this corrupted sum of the messages back to both users. They use knowledge
of their own message to infer a corrupted version of the other user's message.
With an end-to-end error correcting code, this scheme can be used for reliably
exchanging information.

There has been a great deal of interest in the idea behind this
example and many powerful extensions and generalizations
have been developed for fading channels [55]-[59], asynchronous
scenarios [60]-[62], code division multiple-access
[63], and practical deployments [64]. See [65] for a survey.

As mentioned earlier, the signal observed at the relay can be
treated as the output of a noisy modulo-2 adder so the relay
can attempt to denoise the modulo-2 sum if each transmitter
employs the same linear code [45], [54], [66]. In practice,
this can be accomplished using low-complexity codes such
as fountain codes [67], repeat-accumulate codes [68], or low-density
parity-check codes [69]. In larger networks, errors can
also be left uncorrected and dealt with using the end-to-end
network error correction framework proposed in [70].



B. Analog Signaling 

Instead of mapping the complex-valued output of the wire- 
less channel into a finite field, the linear network coding frame- 
work can be modified to operate directly in the complex field. 
This can result in significant performance gains as the desired 
linear combinations are identical to the operation performed 
by the channel, ignoring the noise. Thus, if the signal-to-noise- 
ratio (SNR) is sufficiently high, this "analog" strategy should 
perform quite well. This approach, often called amplify-and-forward
[71]-[75], was proposed for two-way relaying in 2006
by Popovski and Yomo [76] as well as Rankov and Wittneben
[77]. Here, we examine how this strategy can be applied to
the two-way relay channel.

The channel from the users to the relay is given by

$$Y_R[t] = X_1[t] + X_2[t] + Z_R[t] \tag{46}$$

where $X_\ell[t]$ is the complex symbol transmitted by user $\ell$ at
time $t$ and $Z_R[t]$ is circularly symmetric complex Gaussian
noise with variance $\sigma^2$. Recall that each user must meet its
power constraint, $\frac{1}{n}\sum_{t=1}^{n} |X_\ell[t]|^2 \le P$. Each user encodes
its message $\mathbf{w}_\ell$ into a codeword $\mathbf{x}_\ell$ using a capacity-achieving
code for a single user Gaussian channel as in Section IV-A.
The symbols of $\mathbf{x}_\ell$ are i.i.d. according to a Gaussian distribution
with variance $P$. If we assume the messages are
independent, then the codewords are also independent and the
observed vector at the relay $\mathbf{y}_R$ is the sum of independent
Gaussian vectors. The variance of $Y_R[t]$ is $2P + \sigma^2$. As desired,
the relay now has a noisy sum of the transmitted signals which
it can broadcast back to the users. The channels from the relay
to users 1 and 2 are



$$Y_1[t] = X_R[t] + Z_1[t] \tag{47}$$

$$Y_2[t] = X_R[t] + Z_2[t] \tag{48}$$

with $X_R[t]$ as the symbol transmitted by the relay at time $t$ and
$Z_\ell[t] \sim \mathcal{CN}(0, \sigma^2)$. We assume that equal amounts of time are
devoted to sending and receiving. The relay simply retransmits
its noisy sum $Y_R[t]$, scaled to meet the power constraint,

$$X_R[t] = \sqrt{\frac{P}{2P + \sigma^2}}\, Y_R[t]. \tag{49}$$

Each user then observes an even noisier version of the sum,

$$Y_1[t] = \sqrt{\frac{P}{2P + \sigma^2}}\left(X_1[t] + X_2[t] + Z_R[t]\right) + Z_1[t] \tag{50}$$

$$Y_2[t] = \sqrt{\frac{P}{2P + \sigma^2}}\left(X_1[t] + X_2[t] + Z_R[t]\right) + Z_2[t], \tag{51}$$

from which it can subtract its own signal and obtain a
corrupted version of the signal transmitted by the other user.
The SNR of the resulting channel is

$$\mathrm{SNR} = \frac{P}{\sigma^2} \cdot \frac{P}{3P + \sigma^2} \tag{52}$$



which means that each user can (theoretically) sustain a rate
up to

$$R_{\mathrm{ANALOG}} = \frac{1}{2}\log_2\left(1 + \frac{P}{\sigma^2} \cdot \frac{P}{3P + \sigma^2}\right) \tag{53}$$

bits per channel use while keeping the probability of error
arbitrarily small. Note that the factor of $\frac{1}{2}$ comes from using
one time slot to communicate to the relay and another to
communicate back to the users. At high SNR, this is quite
close to the ideal performance which is the rate achievable
by one user communicating via the relay as if the other was
silent. This rate is

$$R_{\mathrm{UPPER}} = \frac{1}{2}\log_2\left(1 + \frac{P}{\sigma^2}\right) \tag{54}$$

bits per channel use and can serve as an upper bound on our
schemes.

For comparison, a routing strategy requires 4 time slots (as
discussed in Section II-A) and can only achieve a rate of

$$R_{\mathrm{ROUTING}} = \frac{1}{4}\log_2\left(1 + \frac{P}{\sigma^2}\right) \tag{55}$$

bits per channel use. If the relay performs network coding
on its received packets, then 3 time slots are required and an
achievable rate of

$$R_{\mathrm{NETCOD}} = \frac{1}{3}\log_2\left(1 + \frac{P}{\sigma^2}\right) \tag{56}$$

bits per channel use is possible.

In Section VII the rate curves of these schemes as well as
the BPSK scheme in Section V-A and the lattice scheme in
Section VI-B are plotted and compared.
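To get a feel for the four rate expressions in this subsection, the following sketch evaluates them side by side: $\frac{1}{2}\log_2(1 + \mathrm{SNR})$ for the upper bound, the analog rate with the reduced end-to-end SNR $\frac{P}{\sigma^2}\cdot\frac{P}{3P+\sigma^2}$, and the 1/3 and 1/4 pre-log factors for network coding and routing. The function name and dictionary keys are illustrative assumptions.

```python
import math

def two_way_rates(power, noise_var):
    """Per-user rates (bits per channel use) for the unit-gain two-way relay channel."""
    snr = power / noise_var
    analog_snr = snr * power / (3 * power + noise_var)  # SNR after amplify-and-forward
    return {
        "upper": 0.5 * math.log2(1 + snr),
        "analog": 0.5 * math.log2(1 + analog_snr),
        "netcod": math.log2(1 + snr) / 3,
        "routing": 0.25 * math.log2(1 + snr),
    }

high = two_way_rates(100.0, 1.0)  # 20 dB
low = two_way_rates(1.0, 1.0)     # 0 dB
```

At 20 dB the ordering is upper > analog > netcod > routing. At 0 dB the analog expression evaluates to less than the routing rate, which illustrates why the analog strategy is described as performing well only when the SNR is sufficiently high.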

In general, the wireless channel may be subject to fading
and the users and relay may have more than one antenna. This
scenario has been studied in detail in the literature [78]-[83].
For larger networks, the approach remains the same: each relay
scales and retransmits its observation [84]. However, the noise
builds up with each retransmission so the SNR requirement
increases as the network grows [75]. We also note that, in
some scenarios, it is advantageous to use the compress-and-forward
framework from [85] instead of amplify-and-forward.



C. Cross-Layer Design 

While analog network coding presents clear benefits in
terms of end-to-end throughput, these seem to come at an
architectural price. Today's wireless networks use a layered
architecture [86] (often referred to as the network protocol
stack) that separates wireless signaling schemes from flow
control, scheduling, and information contents. In order to
make this decoupling possible, the physical layer employs
error-correcting codes to lower the raw error probabilities,
and on top of that, an acknowledgment feedback process
(often expressed as ACK/NACK) is used according to which
a transmitter repeats a certain packet until it receives an
acknowledgment signal from the receiver. Higher layers then
rely on an error-free transmission of a certain capacity. Indeed,
this capacity limitation is the only way in which higher
layers are aware of the underlying physical reality. With
analog network coding, it is not possible to enable this.
Here, intermediate nodes will forward erroneous packets (or
functions thereof), and there is no way for an intermediate
node to tell whether it is forwarding something useful or not.
For reliable communication, additional error correction has
to be implemented end-to-end. This may be acceptable for
small networks (such as the two-way relay channel example),
but it will prevent a layered architecture for larger networks,
instead requiring cross-layer design. By this, we mean that
higher layers have to take into account a detailed description
of the physical reality of the communications medium, beyond
simple capacity figures. Kawadia and Kumar argued that the
benefits of cross-layer design may be offset by the loss of
robustness and modularity in [87].



In Sections VI and VIII we suggest a framework for reliable
physical layer network coding across wireless links. This
framework can be implemented in a completely modular fashion
since each receiver can reliably decode a linear function
of the transmitted bits. Standard acknowledgment feedback
protocols can be employed in this scenario, as well as novel
feedback protocols developed with network coding in mind
[88]. Network layer protocols can be built around state-of-the-art
wired network coding algorithms. This framework has
shown promise in several theoretical studies but much work
remains to show that it can be successfully adapted to practice.

VI. Reliable Physical Layer Network Coding 

The framework developed in Section III is able to protect
linear combinations of codewords against noise. However, the
tools seem to only be a natural fit for channels that can
be written as a noisy operation over a finite field. While
this behavior can be mimicked on a wireless channel by
using appropriate hard decision rules (as in Section V-A), this
approach does not, in general, lead to the best performance. 
In this section, we describe a principled approach to wireless 
channels with equal channel gains. Ultimately, this approach 
will enable us to build a digital interface for physical layer 
network coding over channels with arbitrary gains. 

A. Nested Lattice Codes 

A natural starting point is to find a capacity-achieving code 
for the Gaussian channel with a linear structure. Specifically, 
we would like a good code over the reals such that the sum 
of any two codewords is itself a codeword. It turns out that 
this matches the definition of a lattice. A lattice $\Lambda$ is a set
of real-valued vectors (or points in $\mathbb{R}^n$) such that for any two
elements $\lambda_1, \lambda_2 \in \Lambda$ we have that $\lambda_1 + \lambda_2 \in \Lambda$. For example,
one simple lattice is just the set of all integers, $\mathbb{Z}$. Of course,
a lattice, by itself, cannot be used as a codebook as it contains
an infinite number of points and clearly violates the power
constraint. A lattice code is usually constructed by carving
out a portion of a lattice and designating the selected points
as codewords. Several such constructions have been proposed
that can achieve the capacity of a Gaussian channel (see, for
instance, [89]-[98]).

The lattice code should also obey some form of modulo 
arithmetic so that we can map between the linear combination 
taken by the channel and our desired linear combination over 
the messages. One elegant solution is to employ the capacity-achieving
nested lattice codes developed by Erez and Zamir
[96]. If a lattice $\Lambda$ is a subset of another lattice $\Lambda_{\mathrm{FINE}}$, $\Lambda \subset \Lambda_{\mathrm{FINE}}$,
then $\Lambda$ is said to be nested in $\Lambda_{\mathrm{FINE}}$. In this nested
lattice pair, $\Lambda$ is often referred to as the coarse lattice and
$\Lambda_{\mathrm{FINE}}$ as the fine lattice. One simple example is the integer
lattice $\mathbb{Z}$ taken together with the integer multiples of $q$, $q\mathbb{Z}$.
Let $Q_\Lambda(\cdot)$ be a quantizer that maps vectors to the nearest lattice
point in Euclidean distance:

$$Q_\Lambda(\mathbf{x}) = \arg\min_{\lambda \in \Lambda} \|\mathbf{x} - \lambda\| \tag{57}$$



The set of points that quantize to a given lattice point is
called the Voronoi region of that point. The Voronoi region
for the zero vector is referred to as the fundamental Voronoi
region,

$$\mathcal{V}_\Lambda = \{\mathbf{x} : Q_\Lambda(\mathbf{x}) = \mathbf{0}\}. \tag{58}$$

Note that, for a lattice, each Voronoi region is just a translation
of the fundamental Voronoi region.

Footnote: If we vary the ratio of time spent sending to receiving, then it is
theoretically possible to reach slightly higher rates.

Footnote: We will only employ lattices that contain the zero vector so that if
$\lambda \in \Lambda$ then $-\lambda \in \Lambda$ as well.

A nested lattice code $\mathcal{L}$ is the set of points of the fine lattice
that lie within the fundamental Voronoi region $\mathcal{V}_\Lambda$ of the coarse
lattice, $\mathcal{L} = \Lambda_{\mathrm{FINE}} \cap \mathcal{V}_\Lambda$. For our example, with the integers
and integers multiplied by $q$, the resulting nested lattice code is
just $\{0, 1, 2, \ldots, q-1\}$. Erez and Zamir's nested lattice codes
have three key properties that make them ideally suited for 
reliable physical layer network coding: 

1) As the dimension (or blocklength) increases, the fun- 
damental Voronoi region of the coarse lattice becomes 
more spherical, which means that it is appropriate for 
guaranteeing the power constraint. 

2) As the dimension increases, the Voronoi regions of the 
fine lattice also become more spherical, which means 
that they afford good protection against Gaussian noise. 

3) The underlying lattices are built from codes over a finite 
field, which means that finite field messages can be 
mapped onto the codebook and back without sacrificing 
linearity. 

For more details, see the excellent overview by Zamir, Shamai,
and Erez [95] as well as [96], [97], [99].

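The modulo operation for a nested lattice code is defined
to be the quantization error,

$$[\mathbf{x}] \bmod \Lambda = \mathbf{x} - Q_\Lambda(\mathbf{x}), \tag{59}$$

and can be shown to satisfy the distributive property,

$$\big[[\mathbf{x}_1] \bmod \Lambda + \mathbf{x}_2\big] \bmod \Lambda = [\mathbf{x}_1 + \mathbf{x}_2] \bmod \Lambda, \tag{60}$$

and the commutative property with respect to quantization onto
the fine lattice,

$$\big[Q_{\Lambda_{\mathrm{FINE}}}([\mathbf{x}] \bmod \Lambda)\big] \bmod \Lambda = \big[Q_{\Lambda_{\mathrm{FINE}}}(\mathbf{x})\big] \bmod \Lambda. \tag{61}$$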
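The two modulo properties can be checked numerically on the one-dimensional example from the text, where the fine lattice is the integers and the coarse lattice is the integer multiples of $q$. The sketch below is an illustration under assumptions: `Q = 5` is an arbitrary choice, ties in the quantizer are broken upward, and the helper names are hypothetical. With a nearest-point quantizer the residues come out centered around zero (in $[-q/2, q/2)$), so a final `% Q` is used to relabel them as $0, \ldots, q-1$.

```python
import math

Q = 5  # coarse lattice Q*Z; the fine lattice is Z (illustrative choice of q)

def quantize(x, step=1.0):
    """Nearest point of the lattice step*Z (ties are rounded upward)."""
    return step * math.floor(x / step + 0.5)

def mod_lattice(x, step):
    """[x] mod step*Z: the quantization error x - Q(x), which lies in [-step/2, step/2)."""
    return x - quantize(x, step)

x1, x2 = 7.3, -4.1

# Distributive property: reducing x1 modulo the coarse lattice first changes nothing.
lhs = mod_lattice(mod_lattice(x1, Q) + x2, Q)
rhs = mod_lattice(x1 + x2, Q)

# Commutative property: fine-lattice quantization commutes with the coarse modulo.
lhs_q = mod_lattice(quantize(mod_lattice(x1, Q)), Q)
rhs_q = mod_lattice(quantize(x1), Q)

# Relabel the centered residues of the integers 0..Q-1 as field elements 0..Q-1.
labels = [int(mod_lattice(v, Q)) % Q for v in range(Q)]
```

Here `lhs` and `rhs` agree up to floating-point rounding, and `lhs_q` and `rhs_q` are both equal to 2.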

The encoding and decoding algorithms for the Erez-Zamir
scheme are easy to describe. First, the encoder maps its
message $\mathbf{w}$ onto a point in the nested lattice code $\mathbf{x} \in \mathcal{L}$
and sends it across the channel. The decoder observes the
transmitted codeword in noise $\mathbf{y} = \mathbf{x} + \mathbf{z}$ and makes the
following estimate:

$$\hat{\mathbf{x}} = \big[Q_{\Lambda_{\mathrm{FINE}}}(\alpha \mathbf{y})\big] \bmod \Lambda. \tag{62}$$

In words, the decoder first scales its observed vector by $\alpha$,
quantizes onto the fine lattice, and then takes the modulus (to
ensure that the decoded fine lattice point is really in the nested
code). It can be shown that (for long enough blocklengths $n$)
the estimate $\hat{\mathbf{x}}$ is equal to the transmitted codeword $\mathbf{x}$ with
high probability so long as the rate is at most

$$\frac{1}{2}\log_2\left(\frac{P}{N_{\mathrm{EFFEC}}}\right) \tag{63}$$

where we define the effective noise variance to be

$$N_{\mathrm{EFFEC}} = \frac{1}{n}\|\alpha \mathbf{y} - \mathbf{x}\|^2. \tag{64}$$



Taking $\alpha = 1$, we get that the effective noise variance is
just $\sigma^2$, the variance of the channel noise. Unfortunately, this
does not take us all the way to the channel capacity, only to
$\frac{1}{2}\log_2\left(\frac{P}{\sigma^2}\right)$. Erez and Zamir showed that to reach capacity,
it is crucial that the channel observation be scaled by the
minimum mean-squared error (MMSE) coefficient, $\alpha = \frac{P}{P + \sigma^2}$
(see [100] for an in-depth discussion of the MMSE scaling).
Now, the effective noise variance is

$$N_{\mathrm{EFFEC}} = \frac{1}{n}\|\alpha(\mathbf{x} + \mathbf{z}) - \mathbf{x}\|^2 \tag{65}$$

$$= \frac{1}{n}\|(\alpha - 1)\mathbf{x} + \alpha \mathbf{z}\|^2 \tag{66}$$

$$= (\alpha - 1)^2 P + \alpha^2 \sigma^2 \tag{67}$$

$$= \frac{P\sigma^2}{P + \sigma^2} \tag{68}$$

where the second-to-last step is, roughly speaking, due to
the fact that capacity-achieving codes have codewords that
look like i.i.d. Gaussian vectors. Plugging this into the rate
expression in (63) yields $\frac{1}{2}\log_2\left(1 + \frac{P}{\sigma^2}\right)$. Note that here we
have only dealt with a real-valued Gaussian channel. Repeating
the same scheme over the imaginary part of the channel yields
the full capacity of a complex-valued Gaussian channel.
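The effect of the scaling coefficient can be checked with a few lines of arithmetic. The sketch below assumes the effective noise takes the form $(\alpha - 1)^2 P + \alpha^2 \sigma^2$, that is, self-noise from the codeword plus scaled channel noise, and compares $\alpha = 1$ with the MMSE choice $\alpha = P/(P + \sigma^2)$. Function names are illustrative.

```python
import math

def effective_noise(alpha, power, noise_var):
    """Effective noise variance (alpha - 1)^2 * P + alpha^2 * sigma^2."""
    return (alpha - 1) ** 2 * power + alpha ** 2 * noise_var

def real_lattice_rate(alpha, power, noise_var):
    """Rate 0.5 * log2(P / N_effec) in bits per real channel use."""
    return 0.5 * math.log2(power / effective_noise(alpha, power, noise_var))

P, SIGMA2 = 3.0, 1.0
alpha_mmse = P / (P + SIGMA2)

rate_unscaled = real_lattice_rate(1.0, P, SIGMA2)    # 0.5 * log2(P / sigma^2)
rate_mmse = real_lattice_rate(alpha_mmse, P, SIGMA2)  # 0.5 * log2(1 + P / sigma^2)
```

For these values the MMSE coefficient is 0.75, the effective noise drops from 1 to 0.75, and the rate rises from about 0.79 to exactly 1 bit per real channel use. The difference between the two choices is most visible at low SNR, where the "1 +" inside the logarithm matters most.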

This scheme can be adapted to the two-way relay channel
in a straightforward fashion. Before doing so, we mention
some guidelines for employing nested lattice codes in practice.
These guidelines are drawn from [95], [96], [101] and we refer
interested readers to these works for more detail. First, the
coarse lattice can simply be taken to be integers multiplied by
the field size, $\Lambda = q\mathbb{Z}^n$. This reduces quantization onto the
coarse lattice to rounding each element of the received vector
to the nearest multiple of $q$. The cost is the shaping gain,
which is at most 0.509 bits per complex channel symbol (see
[96]). Next, the fine lattice $\Lambda_{\mathrm{FINE}}$ can be the codewords of any
linear code over the finite field of size $q$. Any popular low-complexity
linear code can be employed, such as a low-density
parity check (LDPC) code. Finally, the fine lattice quantizer
can be replaced with an appropriate low-complexity decoding
algorithm (written as DECODER below). The cost of replacing
the fine lattice with a low-complexity code is just the gap
to capacity for this code. Mathematically, the encoder just
becomes multiplication by the generator matrix, $\mathbf{x} = \mathbf{G}\mathbf{w}$.
The decoding operation in (62) simplifies to

$$\hat{\mathbf{x}} = [\mathrm{DECODER}(\alpha \mathbf{y})] \bmod q\mathbb{Z}^n. \tag{69}$$

Thus, through the use of modern low-complexity codes, nested
lattice codes can be implemented in a practically feasible
manner. See [101] for more details. In Section IX we give an
overview of some very recent efforts to build low-complexity
codes that are specifically tailored for reliable physical layer
network coding.



Footnote: The observant reader will have noticed that part of the noise comes from
the codeword itself. Proving that this "self-noise" is only as bad as additional
Gaussian noise is technically subtle and usually involves the use of dithering
vectors that are removed prior to decoding. This analysis is beyond the scope
of this survey and we refer curious readers to [96] for more details.

B. Equal Channel Gains

We now show that the relay in the Gaussian two-way relay
channel can recover the sum of the transmitted codewords
modulo the coarse lattice, $\mathbf{x}_{\mathrm{SUM}} = [\mathbf{x}_1 + \mathbf{x}_2] \bmod \Lambda$. This
nested lattice scheme was proposed by Narayanan, Wilson,
and Sprintson for the two-way relay channel in 2007 [102],
[103]. Concurrently, we proposed a lattice framework for
general Gaussian multiple-access networks that attains the
same performance for the two-way relay channel [38], [104].
Subsequently, Nam, Chung, and Lee proposed a nested lattice
scheme for unequal (but known) channel gains [105], [106].

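Each user encodes its message onto a codeword from a good
nested lattice code. Recall that the relay observes the sum of
the channel inputs plus Gaussian noise, $\mathbf{y}_R = \mathbf{x}_1 + \mathbf{x}_2 + \mathbf{z}_R$.
The relay uses the decoding rule

$$\hat{\mathbf{x}}_{\mathrm{SUM}} = \big[Q_{\Lambda_{\mathrm{FINE}}}(\alpha \mathbf{y}_R)\big] \bmod \Lambda \tag{70}$$

$$= \big[Q_{\Lambda_{\mathrm{FINE}}}([\alpha \mathbf{y}_R] \bmod \Lambda)\big] \bmod \Lambda \tag{71}$$

where the last line is due to the commutative property of
the modulo operation. We now show that the term inside the
quantization operator is just the desired sum $\mathbf{x}_{\mathrm{SUM}}$ plus some
noise,

$$[\alpha \mathbf{y}_R] \bmod \Lambda \tag{72}$$

$$= [\alpha(\mathbf{x}_1 + \mathbf{x}_2 + \mathbf{z}_R)] \bmod \Lambda \tag{73}$$

$$= [\mathbf{x}_1 + \mathbf{x}_2 + (\alpha - 1)(\mathbf{x}_1 + \mathbf{x}_2) + \alpha \mathbf{z}_R] \bmod \Lambda \tag{74}$$

$$= \big[[\mathbf{x}_1 + \mathbf{x}_2] \bmod \Lambda + (\alpha - 1)(\mathbf{x}_1 + \mathbf{x}_2) + \alpha \mathbf{z}_R\big] \bmod \Lambda$$

$$= [\mathbf{x}_{\mathrm{SUM}} + (\alpha - 1)(\mathbf{x}_1 + \mathbf{x}_2) + \alpha \mathbf{z}_R] \bmod \Lambda, \tag{75}$$

where the second-to-last step follows by the distributive property
of the modulo operation. Therefore, the effective noise
variance is

$$N_{\mathrm{EFFEC}} = \frac{1}{n}\|(\alpha - 1)(\mathbf{x}_1 + \mathbf{x}_2) + \alpha \mathbf{z}_R\|^2 \tag{76}$$

$$= (\alpha - 1)^2\, 2P + \alpha^2 \sigma^2 \tag{77}$$

where the last line is due to the fact that $\mathbf{x}_1$ and $\mathbf{x}_2$ are
independent vectors which (nearly) look as if drawn from an
i.i.d. Gaussian distribution with power $P$. The optimal scaling
coefficient is the MMSE coefficient $\alpha = \frac{2P}{2P + \sigma^2}$ and plugging
this into (63) yields a rate of $\frac{1}{2}\log_2\left(\frac{1}{2} + \frac{P}{\sigma^2}\right)$. If this scheme
is used on both the real and imaginary dimensions, the relay
can decode the sum at any rate up to

$$R_{\mathrm{COMP}} = \log_2\left(\frac{1}{2} + \frac{P}{\sigma^2}\right). \tag{78}$$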

In Figure 10 the nested lattice scheme described above is
illustrated for the special case of $\alpha = 1$. Each user transmits
a vector from the nested lattice code. The channel outputs the
noisy sum, which is observed at the receiver. Note that the
sum of the two vectors exceeds the boundary of the original
nested lattice code. However, by decoding to the closest fine
lattice point and then taking the modulo operation, the modulo
sum of the codewords can be recovered.

Recall that the ultimate goal is for the relay to reliably
decode the modulo sum of the original messages, not the codewords.
While this is not necessarily possible for any nested
lattice code, the Erez-Zamir codes used here are constructed
using a linear code over a finite field. It can therefore be shown
that there exists a mapping $\phi$ from the finite field message
vectors onto the nested lattice code that preserves linearity.
See Lemma 6 in our recent paper [107] for a construction of $\phi$.
Mathematically, this property can be expressed as

Encoding: $$\mathbf{x}_\ell = \phi(\mathbf{w}_\ell) \tag{79}$$

Decoding: $$\phi^{-1}\big([a_1 \mathbf{x}_1 + a_2 \mathbf{x}_2 + \cdots + a_L \mathbf{x}_L] \bmod \Lambda\big) = a_1 \mathbf{w}_1 \oplus a_2 \mathbf{w}_2 \oplus \cdots \oplus a_L \mathbf{w}_L. \tag{80}$$

For the low-complexity case where the coarse lattice is $q\mathbb{Z}^n$
and the fine lattice is a linear code, $\phi$ is just the generator
matrix $\mathbf{G}$ of the linear code and $\phi^{-1}$ is its inverse.

This mapping is the last piece of the puzzle. With it, the sum
of the messages can be recovered directly from the modulo
sum of the codewords,

$$\phi^{-1}\big([\mathbf{x}_1 + \mathbf{x}_2] \bmod \Lambda\big) = \mathbf{w}_1 \oplus \mathbf{w}_2. \tag{81}$$
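The linearity-preserving mapping can be illustrated for the low-complexity case, where the coarse lattice is $q\mathbb{Z}^n$ and the codewords come from a linear code over the field of size $q$. Everything specific in this sketch is an assumption made for illustration: the field size `Q = 5`, the small systematic generator matrix `G`, and the use of symbol-wise rounding as a stand-in for DECODER. A real system would replace the rounding step with a decoder that exploits the redundancy of the code.

```python
Q = 5  # prime field size (illustrative)

# Hypothetical systematic generator matrix of a length-4, dimension-2 code over F_5.
G = [
    [1, 0],
    [0, 1],
    [1, 1],
    [1, 2],
]

def encode(w):
    """phi: message in F_Q^k -> codeword with entries in {0, ..., Q-1} (x = G w mod Q)."""
    return [sum(g * m for g, m in zip(row, w)) % Q for row in G]

def invert_mapping(x):
    """phi^{-1} for a systematic code: the message is the first k codeword symbols."""
    return x[: len(G[0])]

def relay_recover_sum(y):
    """Round to the integer grid, reduce modulo Q (coarse lattice Q*Z^n), then invert phi."""
    x_sum = [round(v) % Q for v in y]
    return invert_mapping(x_sum)

w1, w2 = [3, 4], [4, 2]
x1, x2 = encode(w1), encode(w2)
noise = [0.2, -0.3, 0.1, -0.4]                         # small perturbation
y = [a + b + z for a, b, z in zip(x1, x2, noise)]      # relay sees the real-valued sum
u_hat = relay_recover_sum(y)
u = [(a + b) % Q for a, b in zip(w1, w2)]              # desired w1 (+) w2 over F_5
```

The real-valued sum of the two codewords leaves the range $\{0, \ldots, q-1\}$, but after the modulo step it is again a codeword, namely the encoding of the message sum, so `u_hat` equals `u` without the relay ever learning `w1` or `w2` individually.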



Now, we can use this in a two-way communication scheme
by using one time slot to transmit the sum of the messages to
the relay and another to send it back to the users. It follows
that the users can exchange messages at any rate up to

$$R_{\mathrm{LATTICE}} = \frac{1}{2}\log_2\left(\frac{1}{2} + \frac{P}{\sigma^2}\right). \tag{82}$$

This rate nearly matches the upper bound in (54) except for a
missing $\frac{1}{2}$ inside the logarithm.
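The size of the gap caused by the missing $\frac{1}{2}$ can be quantified numerically. The sketch below compares $\frac{1}{2}\log_2(\frac{1}{2} + P/\sigma^2)$ with the upper bound $\frac{1}{2}\log_2(1 + P/\sigma^2)$. Clamping the lattice expression at zero when $P/\sigma^2 < \frac{1}{2}$ is an assumption added here so that the function never returns a negative rate; the function names are illustrative.

```python
import math

def upper_bound_rate(power, noise_var):
    """0.5 * log2(1 + P / sigma^2)."""
    return 0.5 * math.log2(1 + power / noise_var)

def two_way_lattice_rate(power, noise_var):
    """0.5 * log2(1/2 + P / sigma^2), clamped at zero (the clamp is an assumption)."""
    return max(0.0, 0.5 * math.log2(0.5 + power / noise_var))

def rate_gap(power, noise_var):
    """Difference between the upper bound and the lattice rate, in bits per channel use."""
    return upper_bound_rate(power, noise_var) - two_way_lattice_rate(power, noise_var)

gaps = {snr: rate_gap(snr, 1.0) for snr in (1.0, 10.0, 100.0, 1000.0)}
```

The gap shrinks as the SNR grows and is already below one thousandth of a bit at 30 dB, which is the sense in which the lattice rate "nearly matches" the upper bound.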



This two-way lattice scheme has been extensively studied
and generalized in the literature. These extensions include
unequal channel gains [106], [108], non-Gaussian channel
models [109], secret messages [110], private messages [111],
direct links [112], as well as more than two transmitters [48],
[113], [114]. Gupta-Kumar style scaling laws [115] have also
been derived for this lattice scheme [116]. We also note that
similar lattice-based schemes can increase achievable rates in
interference channels [117], [118].

Overall, this nested lattice scheme can be used as a digital
framework for physical layer network coding on the wireless
channel. It is able to exploit the addition performed by the
channel while preserving modulo arithmetic and protecting
against Gaussian noise. In a larger network, each relay will
recover a linear combination of the original messages. It can
then transmit this linear combination as its own message, just
as relays in wireline networks send out linear combinations of
their received messages. In Section VIII, we will generalize the
results in this section to unequal channel gains. Furthermore,
we show that the transmitters do not even need to know the
channel gains, which means that this scheme can be applied
to fading channels and scenarios with more than one receiver.
In the next section, we plot the performance of each scheme
discussed so far for the Gaussian two-way relay channel.



Several groups have unsuccessfully tried to find a lattice scheme that can
attain the upper bound. This remains an open problem.

Fig. 10. Each transmitter maps its finite field message into an element of the nested lattice code and sends this vector on the channel. Here, the channel
coefficients are taken to be equal, $h_1 = h_2 = 1$. Therefore, the receiver observes a noisy sum of the transmitted vectors and determines the closest lattice point.
After taking a modulo operation with respect to the coarse lattice, the receiver can invert the mapping and determine the modulo sum of the original messages.



VII. Performance Comparison

In Figure 11, we compare the performance of the various
network coding strategies discussed in the present paper, for
the particular case of a Gaussian two-way relay channel. The
figure displays the rate per user in bits per channel use, as
a function of the transmit power per user, while the noise
is assumed to be of unit variance. Starting from the top, the
figure shows the simple upper bound given in Equation (54). It
is instructive to consider the behavior at large transmit power
$P$, characterized by the limit of the ratio $R/\log(1 + P/\sigma^2)$.
For the upper bound, it is clear that this limiting slope is 1/2.

The next curve, labeled "Lattice," is the performance of
reliable physical layer network coding via lattice codes, given
in Equation (82). We note that this scheme is close to the upper
bound and that it attains the same limiting slope of 1/2.

The following curve, labeled "Analog," represents the analog
network coding scheme discussed in Section V-B. It
follows the upper bound but never meets it. This is because
the noise observed at the relay is sent along with the desired
signal, an effect that would be even more detrimental if there
were further stages in the network. In the limit of high transmit
power $P$, this effect becomes negligible and the optimal
limiting slope of 1/2 is attained.

The curve labeled "Netcod" is the performance attained
by the wireless broadcast network coding scheme in Section
II-A depicted in Figure 4. Each user takes a turn sending its
message to the relay and the relay sends the mod-2 sum back
to the users. This scheme loses out at high transmit power
due to the fact that three channel uses are needed for each
exchange, making for a limiting slope of 1/3.

The curve labeled "Routing" is the scheme in Section II-A
depicted in Figure 3. Each user takes a turn sending its
message to the relay and the relay sends these back to the
users. At high transmit power, one can verify that this leads
to a limiting slope of 1/4.

Fig. 11. A performance comparison of the schemes for the two-way relay
channel discussed in this paper. (Horizontal axis: transmitter power in dB.)



The final curve, labeled "BPSK," is the binary scheme
in Section V-A, according to which each user transmits its
bits uncoded and the relay makes a hard decision about the
modulo-2 sum. The broadcast phase is abstracted as a bit
pipe simultaneously to both users with a rate corresponding
to the capacity of the broadcast channel from the relay to the
users (which thus depends on the transmit power $P$).
Error-correcting codes are then used end-to-end. Note that because
BPSK is used, this scheme plateaus at 1 bit per
channel use; if a larger constellation were used, this plateau
would be higher (but the performance at low SNR might
suffer).

As a final caveat, we note that some of the achievable
schemes can be slightly improved by optimizing the ratio of
time spent in the different phases. For the present figure, it is
assumed that all these phases are of the same length, as in the
descriptions provided in Section II-A.



VIII. Fading Channels 

If the channel simply outputs a noisy sum of the transmitted
signals, then it is intuitive that this can be exploited for
adding up the messages. Yet, in general, the channel output
will be some linear combination according to complex-valued
coefficients and it is not immediately clear that this will be a
good match for network coding over a finite field. Here, we
demonstrate how to overcome this obstacle using the
compute-and-forward framework we proposed in [107]. First, we show
how to reliably compute over real-valued channels and then
we use this scheme as a building block for complex-valued
channels.

A. Real-Valued Channels 

Consider a real-valued channel whose output vector is just a
linear function of the transmitted vectors plus some Gaussian
noise:

$$\mathbf{y} = h_1 \mathbf{x}_1 + h_2 \mathbf{x}_2 + \cdots + h_L \mathbf{x}_L + \mathbf{z}. \tag{83}$$



Assume that each user selects and transmits a point from a 
nested lattice code. The key idea is that, instead of trying to 
decode the sum, the receiver should aim to decode an integer 
combination of the codewords (modulo the coarse lattice). 



$$\mathbf{v} = [a_1 \mathbf{x}_1 + a_2 \mathbf{x}_2 + \cdots + a_L \mathbf{x}_L] \bmod \Lambda. \tag{84}$$



This integer combination is itself a codeword, due to the linear 
structure of the nested lattice code, and is therefore afforded 
protection against noise. If these integer coefficients are close 
enough to the real-valued coefficients of the channel, then 
it seems plausible that the receiver can decode the function 
successfully. More precisely, the receiver makes the following 
estimate of v: 

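\begin{align}
\hat{\mathbf{v}} &= [Q_{\Lambda_{\text{FINE}}}(\alpha \mathbf{y})] \bmod \Lambda \tag{85} \\
&= [Q_{\Lambda_{\text{FINE}}}([\alpha \mathbf{y}] \bmod \Lambda)] \bmod \Lambda \tag{86}
\end{align}

where the second line is due to the commutative property. Prior
to quantization onto the fine lattice, this is just the desired
function $\mathbf{v}$ plus some noise:

\begin{align}
& [\alpha \mathbf{y}] \bmod \Lambda \tag{87} \\
&= [\alpha(h_1 \mathbf{x}_1 + \cdots + h_L \mathbf{x}_L + \mathbf{z})] \bmod \Lambda \tag{88} \\
&= [a_1 \mathbf{x}_1 + \cdots + a_L \mathbf{x}_L + (\alpha h_1 - a_1)\mathbf{x}_1 + \cdots + (\alpha h_L - a_L)\mathbf{x}_L + \alpha \mathbf{z}] \bmod \Lambda \tag{89} \\
&= [\mathbf{v} + (\alpha h_1 - a_1)\mathbf{x}_1 + \cdots + (\alpha h_L - a_L)\mathbf{x}_L + \alpha \mathbf{z}] \bmod \Lambda. \notag
\end{align}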

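The effective noise comes from both the non-integer part of the
channel and the Gaussian noise. The effective noise variance
is

$$\sigma^2_{\text{EFFEC}} = \frac{1}{n} E\big\| \alpha \mathbf{z} + (\alpha h_1 - a_1)\mathbf{x}_1 + \cdots + (\alpha h_L - a_L)\mathbf{x}_L \big\|^2 = \alpha^2 \sigma^2 + P \sum_{\ell=1}^{L} (\alpha h_\ell - a_\ell)^2 . \tag{90}$$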



This means that the receiver can recover the integer
combination of codewords so long as the rate of the nested lattice
code is at most

$$R_{\text{COMP}} = \frac{1}{2} \log_2\left(\frac{P}{\sigma^2_{\text{EFFEC}}}\right). \tag{91}$$



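Again, it can be shown that the optimal $\alpha$ is the MMSE
coefficient,

$$\alpha_{\text{MMSE}} = \frac{P \sum_{\ell=1}^{L} h_\ell a_\ell}{\sigma^2 + P \sum_{\ell=1}^{L} h_\ell^2}. \tag{92}$$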



Here, the role of $\alpha$ can be thought of as trying to move
the channel coefficients towards integers. For instance, if the
channel coefficients are $h_1 = 0.5$ and $h_2 = 0.5$, then choosing
$\alpha = 2$ converts the channel into a noisy adder (like that studied
in Section VI-B) at the price of quadrupling the noise variance.
Substituting $\alpha_{\text{MMSE}}$ into the rate expression simplifies it down
to

$$R_{\text{COMP}} = \frac{1}{2} \log_2\left( \Big( \sum_{\ell=1}^{L} a_\ell^2 - \alpha_{\text{MMSE}} \sum_{\ell=1}^{L} h_\ell a_\ell \Big)^{-1} \right). \tag{93}$$
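The effect of the scaling $\alpha$ can be checked numerically. The sketch below evaluates the effective noise variance in the expanded form $\alpha^2\sigma^2 + P\sum_\ell(\alpha h_\ell - a_\ell)^2$, which assumes that the codewords have power $P$ per dimension and are independent of each other and of the noise. It then compares the choice $\alpha = 2$ from the example above with the MMSE coefficient. The function names and the values $P = 10$, $\sigma^2 = 1$ are illustrative assumptions.

```python
import math


def effective_noise(alpha, h, a, P, sigma2):
    """alpha^2 * sigma^2 + P * sum_l (alpha*h_l - a_l)^2."""
    mismatch = sum((alpha * hl - al) ** 2 for hl, al in zip(h, a))
    return alpha ** 2 * sigma2 + P * mismatch


def alpha_mmse(h, a, P, sigma2):
    """Minimizer of effective_noise over alpha (set the derivative to zero)."""
    ha = sum(hl * al for hl, al in zip(h, a))
    hh = sum(hl * hl for hl in h)
    return P * ha / (sigma2 + P * hh)


def computation_rate(alpha, h, a, P, sigma2):
    """0.5 * log2(P / effective noise), clipped at zero."""
    return max(0.0, 0.5 * math.log2(P / effective_noise(alpha, h, a, P, sigma2)))


h, a, P, sigma2 = [0.5, 0.5], [1, 1], 10.0, 1.0

# alpha = 2 removes the non-integer mismatch but scales the noise by alpha^2.
noise_adder = effective_noise(2.0, h, a, P, sigma2)

# The MMSE choice trades a little mismatch for less noise amplification.
a_star = alpha_mmse(h, a, P, sigma2)
noise_mmse = effective_noise(a_star, h, a, P, sigma2)

# Closed form after substituting the MMSE coefficient, as in (93).
aa = sum(al * al for al in a)
ha = sum(hl * al for hl, al in zip(h, a))
closed_form = P * (aa - a_star * ha)

print(noise_adder, noise_mmse, closed_form)
print(computation_rate(2.0, h, a, P, sigma2),
      computation_rate(a_star, h, a, P, sigma2))
```

With these numbers the noisy-adder choice gives an effective noise variance of $4\sigma^2$, and the MMSE coefficient does at least as well.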

Finally, the recovered integer combination of the codewords
can be mapped to a modulo-$q$ linear combination of the
messages:

$$\phi^{-1}(\mathbf{v}) = a_1 \mathbf{w}_1 \oplus a_2 \mathbf{w}_2 \oplus \cdots \oplus a_L \mathbf{w}_L . \tag{94}$$

Fig. 12. In a wireless network, each receiver is free to decode the linear
combination of messages that best fits its observed channel coefficients.

Therefore, this channel model can be exploited to reliably 
compute linear functions over a finite field. We now draw 
attention to the fact that the encoding function does not depend 
on the channel coefficients: the transmitters just send out their 
nested lattice point. This implies that even if the channel 
coefficients are completely unknown to the transmitters, this 
scheme can still be used. (Clearly, the receiver needs to know 
the channel coefficients so it can choose the integer coefficients 
appropriately and determine the optimal a.) Thus, if multiple 
receivers with different channel coefficients all receive signals 
from the same transmitters, each receiver can decode its own 
particular linear combination at a rate as given in (93). See
Figure 12 for an illustration.

Note that the receiver is not limited to decoding a single
function. It can run the decoding step on its observed vector
several times to extract several different functions. As a special
case, the receiver can recover a single message $\mathbf{w}_m$ by setting
the associated coefficient to one, $a_m = 1$, and all others to
zero. It can be shown that the rate for this special case is

$$R_{\text{COMP}} = \frac{1}{2} \log_2\left(1 + \frac{h_m^2 P}{\sigma^2 + P \sum_{\ell \neq m} h_\ell^2}\right), \tag{95}$$

which corresponds exactly to the rate attainable by treating
other simultaneous transmissions as undesirable interference.
Taking this idea a bit further, it can be demonstrated that the
nested lattice framework can attain any point in the Gaussian
multiple-access rate region [107]. Thus, it can only help to use
this nested lattice scheme for decoding functions since, as a
backup, the receiver can just decode the messages individually
with no rate penalty whatsoever.
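The claim that the special case $a_m = 1$ reduces to treating interference as noise can be verified numerically. The sketch below assumes the closed-form computation rate $\frac{1}{2}\log_2\big((\sum_\ell a_\ell^2 - \alpha_{\text{MMSE}}\sum_\ell h_\ell a_\ell)^{-1}\big)$ with the MMSE coefficient, and compares it with the standard interference-as-noise rate $\frac{1}{2}\log_2(1 + \text{SINR})$. The channel gains and the function names are hypothetical.

```python
import math


def rate_comp(h, a, P, sigma2):
    """Computation rate with the MMSE scaling coefficient (closed form)."""
    ha = sum(hl * al for hl, al in zip(h, a))
    hh = sum(hl * hl for hl in h)
    aa = sum(al * al for al in a)
    alpha = P * ha / (sigma2 + P * hh)
    return 0.5 * math.log2(1.0 / (aa - alpha * ha))


def rate_interference_as_noise(h, m, P, sigma2):
    """Rate for user m when all other users are treated as extra noise."""
    interference = P * sum(hl * hl for i, hl in enumerate(h) if i != m)
    return 0.5 * math.log2(1.0 + P * h[m] ** 2 / (sigma2 + interference))


h, P, sigma2 = [1.2, -0.7, 0.4], 5.0, 1.0
for m in range(len(h)):
    unit = [1 if i == m else 0 for i in range(len(h))]  # a_m = 1, others 0
    print(m, rate_comp(h, unit, P, sigma2),
          rate_interference_as_noise(h, m, P, sigma2))
```

The two columns agree for every user, which is the algebraic identity behind the statement that decoding individual messages carries no rate penalty in this framework.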

Unlike the finite field strategies considered in Section III,
different coefficients result in different rates. In many
scenarios, the receiver can simply search for the coefficients that
offer the highest possible rate and then target that equation.
If the channel coefficients are sufficiently independent, then 
there is a good chance that each receiver will decode a linearly 
independent equation and the end-to-end linear transformation 
between the sources and the destinations will be invertible. In 
some cases, there are benefits to imposing some restrictions 
on the coefficients that can be selected. 
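One simple, if brute-force, way to carry out the coefficient search is to enumerate small integer vectors and keep the one with the largest computation rate. The sketch below does this using the closed-form rate with the MMSE coefficient. The search range `max_coeff`, the function names, and the example channel are assumptions made for illustration, and more efficient search methods are not covered here.

```python
import itertools
import math


def rate_comp(h, a, P, sigma2):
    """Computation rate with the MMSE scaling coefficient, clipped at zero."""
    ha = sum(hl * al for hl, al in zip(h, a))
    hh = sum(hl * hl for hl in h)
    aa = sum(al * al for al in a)
    alpha = P * ha / (sigma2 + P * hh)
    # aa - alpha*ha is positive for any non-zero a (Cauchy-Schwarz).
    return max(0.0, 0.5 * math.log2(1.0 / (aa - alpha * ha)))


def best_equation(h, P, sigma2, max_coeff=3):
    """Exhaustive search over integer vectors with entries in [-max_coeff, max_coeff]."""
    best_rate, best_a = 0.0, None
    for a in itertools.product(range(-max_coeff, max_coeff + 1), repeat=len(h)):
        if not any(a):
            continue  # the all-zero equation carries no information
        r = rate_comp(h, a, P, sigma2)
        if r > best_rate:
            best_rate, best_a = r, a
    return best_rate, best_a


rate, a = best_equation([0.5, 0.5], P=100.0, sigma2=1.0)
print(rate, a)
```

For equal channel gains the search settles on the plain sum (up to a common sign), and its rate is well above that of decoding a single message, as computed by `rate_comp([0.5, 0.5], (1, 0), 100.0, 1.0)`.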

Overall, this strategy is especially useful in networks with
concurrent transmissions, as receivers must cope with
interference in one way or another. As an example, in Figure
13 there are three transmitters that simultaneously transmit
their messages to a single receiver, which tries to reliably
decode a linear function. Note that recovering a single message
corresponds to recovering a function with a single non-zero
coefficient. Each fading coefficient is drawn i.i.d. from a
Gaussian distribution and the fading realization is only known
to the receiver. In Figure 14, we compare three different
strategies for this network. The "Decode an Equation" strategy
is just the nested lattice scheme discussed in this section,
whose performance is given by (93). The "Decode a Message"
strategy is the rate attainable if the receiver only tries to get
a single message, not an equation. This curve corresponds
to the information-theoretic optimum for the special case of
decoding a single message. Finally, "Interference as Noise"
is a suboptimal strategy for decoding one message that treats
the other two messages as noise. Its performance is given by
(95).



B. Complex-Valued Channels 

As mentioned in Section IV, the baseband representation,

$$\mathbf{y} = h_1 \mathbf{x}_1 + h_2 \mathbf{x}_2 + \cdots + h_L \mathbf{x}_L + \mathbf{z}, \tag{96}$$

of a narrowband wireless channel is complex-valued. With a
few simple modifications, the nested lattice scheme for
real-valued channels can be applied here as well. Each encoder
breaks its message $\mathbf{w}_\ell$ into two messages of equal length,
$\mathbf{w}_\ell^{\text{Re}}$ and $\mathbf{w}_\ell^{\text{Im}}$. The resulting messages are mapped onto nested
lattice points (using $\phi$) and sent as the real and imaginary parts
of the transmitted codeword.




Fig. 13. Three users simultaneously transmit their messages to a relay that
wishes to decode a linear function, $a_1 \mathbf{w}_1 \oplus a_2 \mathbf{w}_2 \oplus a_3 \mathbf{w}_3$. In Figure 14,
three strategies for this scenario are plotted.

Fig. 14. Average rate for recovering an equation of three simultaneously
transmitted messages (horizontal axis: transmitter power in dB). The top line
("Decode an Equation") corresponds to using a nested lattice
code to decode an equation. The middle line ("Decode a Message") corresponds
to the information-theoretic optimum for recovering a single message. The
bottom line ("Interference as Noise") corresponds to recovering a single
message while treating the other two as noise.
The fading is drawn from a Gaussian distribution and is only known to the
receiver.



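The transmitted codeword of user $\ell$ thus has real and imaginary parts

$$\text{Re}(\mathbf{x}_\ell) = \phi(\mathbf{w}_\ell^{\text{Re}}), \qquad \text{Im}(\mathbf{x}_\ell) = \phi(\mathbf{w}_\ell^{\text{Im}}). \tag{97}$$

The decoder deals with the real and imaginary parts of the
received signal separately. The real part is

$$\text{Re}(\mathbf{y}) = \sum_{\ell=1}^{L} \big[ \text{Re}(h_\ell)\,\text{Re}(\mathbf{x}_\ell) - \text{Im}(h_\ell)\,\text{Im}(\mathbf{x}_\ell) \big] + \text{Re}(\mathbf{z}), \tag{98}$$

which can be treated like the received signal from a
real-valued channel with $2L$ transmitters. From this, the receiver
can decode a linear function of the form

$$\mathbf{u}^{\text{Re}} = \Big[ \sum_{\ell=1}^{L} \big( a_\ell^{\text{Re}} \mathbf{w}_\ell^{\text{Re}} - a_\ell^{\text{Im}} \mathbf{w}_\ell^{\text{Im}} \big) \Big] \bmod q. \tag{99}$$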

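Similarly, the imaginary part is

$$\text{Im}(\mathbf{y}) = \sum_{\ell=1}^{L} \big[ \text{Im}(h_\ell)\,\text{Re}(\mathbf{x}_\ell) + \text{Re}(h_\ell)\,\text{Im}(\mathbf{x}_\ell) \big] + \text{Im}(\mathbf{z}), \tag{100}$$

from which the receiver recovers the complementary linear
function

$$\mathbf{u}^{\text{Im}} = \Big[ \sum_{\ell=1}^{L} \big( a_\ell^{\text{Im}} \mathbf{w}_\ell^{\text{Re}} + a_\ell^{\text{Re}} \mathbf{w}_\ell^{\text{Im}} \big) \Big] \bmod q. \tag{101}$$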
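The reduction from a complex-valued channel with $L$ users to a real-valued channel with $2L$ inputs can be checked with a few lines of Python. The sketch below is noiseless and uses scalar symbols instead of vectors. The helper names, the channel gains, and the field size $q = 5$ are illustrative assumptions. The second helper evaluates the pair of equations recovered from the real and imaginary parts, which together amount to multiplying each message by the complex integer $a_\ell^{\text{Re}} + j a_\ell^{\text{Im}}$.

```python
def real_equivalent(h):
    """Coefficients seen by Re(y) and Im(y) on the 2L real inputs
    [Re(x_1), ..., Re(x_L), Im(x_1), ..., Im(x_L)]."""
    re_row = [c.real for c in h] + [-c.imag for c in h]
    im_row = [c.imag for c in h] + [c.real for c in h]
    return re_row, im_row


def complex_equation(a_re, a_im, w_re, w_im, q):
    """Real-part and imaginary-part equations for scalar symbols in {0, ..., q-1}."""
    terms = list(zip(a_re, a_im, w_re, w_im))
    u_re = sum(ar * wr - ai * wi for ar, ai, wr, wi in terms) % q
    u_im = sum(ai * wr + ar * wi for ar, ai, wr, wi in terms) % q
    return u_re, u_im


# Noiseless check of the decomposition with L = 2 users.
h = [0.9 + 0.4j, -0.3 + 1.1j]
x = [2.0 - 1.0j, 0.5 + 3.0j]
y = sum(hl * xl for hl, xl in zip(h, x))

re_row, im_row = real_equivalent(h)
stacked = [xl.real for xl in x] + [xl.imag for xl in x]
y_re = sum(c * s for c, s in zip(re_row, stacked))
y_im = sum(c * s for c, s in zip(im_row, stacked))
print(y, y_re, y_im)

# Equation pair for coefficients (1 + 2j, 3 - 1j) and messages (2 + 4j, 1 + 3j).
print(complex_equation([1, 3], [2, -1], [2, 1], [4, 3], q=5))
```

The stacked real representation reproduces both parts of the complex output, so the real-valued decoding procedure can be applied to each part in turn.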



In [107], we showed that the effective noise variance is always
the same for real and imaginary received signals and the
computation rate

$$R_{\text{COMP}} = \log_2\left(\frac{P}{\sigma^2_{\text{EFFEC}}}\right)$$


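bits per channel use is achievable. If a destination is given
several of these equations, it can infer the original messages
as long as the matrix of complex integer coefficients

$$\begin{bmatrix}
a_{11}^{\text{Re}} + j a_{11}^{\text{Im}} & a_{12}^{\text{Re}} + j a_{12}^{\text{Im}} & \cdots & a_{1L}^{\text{Re}} + j a_{1L}^{\text{Im}} \\
a_{21}^{\text{Re}} + j a_{21}^{\text{Im}} & a_{22}^{\text{Re}} + j a_{22}^{\text{Im}} & \cdots & a_{2L}^{\text{Re}} + j a_{2L}^{\text{Im}} \\
\vdots & \vdots & \ddots & \vdots \\
a_{M1}^{\text{Re}} + j a_{M1}^{\text{Im}} & a_{M2}^{\text{Re}} + j a_{M2}^{\text{Im}} & \cdots & a_{ML}^{\text{Re}} + j a_{ML}^{\text{Im}}
\end{bmatrix}$$
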

is full rank. This scheme is studied in detail in [107] as well as
in the first author's PhD thesis [42]. One interesting extension
is to use multiple antennas at the receiver to steer the channel
coefficients towards integer values, thus increasing the rate at
which an equation can be recovered [119].



IX. Code Constructions 

The above analysis shows that, for sufficiently large block- 
lengths, there exist encoders and decoders that can efficiently 
harness the interference property of the wireless medium for 
network coding. While we have chosen codebooks with a 
linear structure, this does not, by itself, imply they can be 
implemented with low complexity, especially on the decoder 
side. This is similar to the classical channel coding problem 
where the capacity gives the ultimate performance limit and 
one then attempts to design practically realizable codes that 
can approach this limit. Very recently, several groups have 
proposed practical coding schemes that are designed with 
reliable physical layer network coding in mind. We briefly 
describe three of these schemes below. 

In [120], Feng, Silva, and Kschischang take an algebraic
approach and propose a set of lattice partitions whose properties
make them appealing for physical layer network coding. For
instance, their nested lattice construction attains a field size of
$q^2$ for the same complexity it takes the basic scheme to attain
a field size of $q$. Through simulations, they also show that
their framework works quite well for blocklengths as small as
100.

The basic compute-and-forward framework employs
constellations and codes over a prime-sized finite field to ensure
a match between the linear combinations taken by the channel
and the linear combinations over the codewords. Unfortunately,
the implementation complexity of a coding scheme
increases with the characteristic of the finite field. (The size of
any finite field can be written as $q^K$, where $q$ is a prime and
$K$ is a positive integer. Usually, $q$ is referred to as the
characteristic of the finite field.) Therefore,
in practice, it is desirable to have constellations of size $2^K$
for some positive integer $K$. With an appropriate mapping,
these constellations can be coupled with binary linear codes
to significantly reduce the implementation complexity. The
main difficulty is that the mapping from constellation points
to codeword symbols must preserve the interference structure
of the channel. To overcome this issue, Hern and Narayanan
proposed using multilevel codes and allowing the receiver
to decode to a larger class of functions [121]. Specifically,
each constellation symbol is specified by a block of $K$ bits.
Different coefficients are allowed for each of these bits but
are constant from symbol to symbol. This allows the receiver
more freedom in mapping the channel operation to something
useful over the finite field. Independently, Ordentlich and Erez
have developed a framework [122] that uses mapping by set
partitioning to go from binary codewords to higher order
constellations. Their preliminary simulations have shown that
this constellation mapping in conjunction with an LDPC code
can perform quite well.



X. Larger Networks 

In the context of a larger network, what does reliable 
physical layer network coding mean for the overall network 
code? Network codes are usually designed over networks of bit 
pipes and the relays are therefore free to select any coefficients 
for their linear combinations that maximize the end-to-end 
throughput. However, in this new framework, the rate at which 
a particular linear combination can be decoded by a relay 
hinges on the match between the desired coefficients and 
the fading coefficients of the wireless channel. Relays must 
choose function coefficients that balance between maximizing 
the local computation rate and the end-to-end throughput. For 
instance, in the two-way relay channel with fading, the relay
can recover any equation $a_1 \mathbf{w}_1 \oplus a_2 \mathbf{w}_2$ so long as both $a_1$
and $a_2$ are non-zero. Otherwise, at least one of the users will
not receive any novel information.
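The non-zero requirement can be seen from how a user would extract the other user's message once the relay broadcasts the equation. The sketch below assumes a prime field size $q$, interprets "non-zero" as non-zero modulo $q$, and uses a hypothetical helper name. It removes the user's own contribution and then inverts the remaining coefficient with Fermat's little theorem.

```python
def recover_other_message(u, a_own, w_own, a_other, q):
    """Solve u = a_own*w_own + a_other*w_other (mod q) for w_other.

    q is assumed to be prime so that every non-zero coefficient is invertible.
    """
    if a_other % q == 0:
        raise ValueError("coefficient of the other message is zero modulo q")
    inv = pow(a_other % q, q - 2, q)  # inverse via Fermat's little theorem
    return [((ui - a_own * wi) * inv) % q for ui, wi in zip(u, w_own)]


q = 7
a1, a2 = 3, 5
w1, w2 = [1, 6, 2], [4, 0, 5]

# Equation recovered and broadcast by the relay.
u = [(a1 * x + a2 * y) % q for x, y in zip(w1, w2)]

# User 1 knows w1 and wants w2; user 2 knows w2 and wants w1.
print(recover_other_message(u, a1, w1, a2, q))
print(recover_other_message(u, a2, w2, a1, q))
```

If either coefficient vanishes modulo $q$, the corresponding call raises an error, which reflects the fact that the equation then carries nothing new for one of the users.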

For a multi-stage relay network, selecting the optimal
function coefficients corresponds to an integer program if we
are given access to the full channel state information. Yet,
in many cases, nodes will only have access to some subset
of the channel state, most likely just that of their immediate
neighbors [123]. For this scenario, new algorithms and
heuristics are needed to enable distributed coefficient selection.
More research is also needed to elucidate how much of an
advantage, in terms of end-to-end throughput, is offered by
reliable physical layer network coding over realistic wireless
network topologies. One preliminary theoretical analysis is
developed in [124], where it is assumed that any node receives
the modulo-2 sum of the packets transmitted by the
nearest-neighbor transmitters. This situation is compared to the case
where nodes can receive full packets from neighboring nodes,
but at an appropriately lower rate. In terms of transport
capacity and for a network of nodes located on a regular
lattice in two dimensions, it is shown that the former more
than doubles the capacity of the latter.

XI. Conclusions

Physical layer network coding is an intuitively pleasing en- 
hancement to network coding. In network coding, intermediate 
nodes forward linear combinations of information packets (or 
more general functions of them) and destination nodes collect 
a sufficient number of linearly independent functions so as to 
be able to recover their desired information packets. Physical 
layer network coding starts from the insight that it can be 
much more efficient for a node to directly learn the linear 
function of several packets, rather than having to first learn
each packet separately and then evaluate the function. This
is particularly tempting in the case of the wireless medium 
since there, transmitted signals naturally interfere in a linear 
fashion. One approach reviewed in this paper, dubbed analog 
network coding, uses this effect in an uncoded fashion, dealing 
separately (end-to-end) with the accumulating noise. In this 
paper, we particularly emphasized a way to circumvent this 
problem and instead enforce reliable physical layer network 
coding even inside of the network. The main idea underlying 
this framework is that each transmitter should employ the 
same linear code. Through examples, we have illustrated the 
superiority of this approach in a capacity sense. A second
advantage concerns the resulting overall layered system
architecture: in reliable physical layer network coding, the physical
layer can be made essentially transparent to the end-to-end
communication process.

Systems are currently being designed to determine to what
extent the promised performance gains are attainable over the
real wireless medium, but much of this exciting path still lies
ahead.